提交 · 3cb5bf1bf947d325fcf6e9458952b51cfd7e6677 · openanolis / cloud-kernel

30 4月, 2013 1 次提交

proc: Delete create_proc_read_entry() · 3cb5bf1b

由 David Howells 提交于 4月 11, 2013

Delete create_proc_read_entry() as it no longer has any users.

Also delete read_proc_t, write_proc_t, the read_proc member of the
proc_dir_entry struct and the support functions that use them.  This saves a
pointer for every PDE allocated.
Signed-off-by: NDavid Howells <dhowells@redhat.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

3cb5bf1b

10 4月, 2013 11 次提交

A
try a saner locking for pde_opener... · 05c0ae21
由 Al Viro 提交于 4月 04, 2013
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
05c0ae21

deal with races between remove_proc_entry() and proc_reg_release() · ca469f35

由 Al Viro 提交于 4月 03, 2013

* serialize the call of ->release() on per-pdeo mutex
* don't remove pdeo from per-pde list until we are through with it
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

ca469f35

procfs: preparations for remove_proc_entry() race fixes · 866ad9a7

由 Al Viro 提交于 4月 03, 2013

* leave ->proc_fops alone; make ->pde_users negative instead
* trim pde_opener
* move relevant code in fs/proc/inode.c
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

866ad9a7

procfs: Clean up huge if-statement in __proc_file_read() · ad147d01

由 David Howells 提交于 4月 04, 2013

Switch huge if-statement in __proc_file_read() around. This then puts the
single line loop break immediately after the if-statement and allows us to
de-indent the huge comment and make it take fewer lines. The code following
the if-statement then follows naturally from the call to dp->read_proc().
Signed-off-by: NDavid Howells <dhowells@redhat.com>

ad147d01

proc: Kill create_proc_entry() · 80e928f7

由 David Howells 提交于 4月 04, 2013

Kill create_proc_entry() in favour of create_proc_read_entry(), proc_create()
and proc_create_data().
Signed-off-by: NDavid Howells <dhowells@redhat.com>

80e928f7

procfs: new helper - PDE_DATA(inode) · d9dda78b

由 Al Viro 提交于 3月 31, 2013

The only part of proc_dir_entry the code outside of fs/proc
really cares about is PDE(inode)->data.  Provide a helper
for that; static inline for now, eventually will be moved
to fs/proc, along with the knowledge of struct proc_dir_entry
layout.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

d9dda78b

A
procfs: kill ->write_proc() · ee21ed0a
由 Al Viro 提交于 3月 31, 2013
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
ee21ed0a
A
procfs: don't allow to use proc_create, create_proc_entry, etc. for directories · b6cdc731
由 Al Viro 提交于 3月 30, 2013
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
b6cdc731

procfs: switch /proc/self away from proc_dir_entry · 021ada7d

由 Al Viro 提交于 3月 29, 2013

Just have it pinned in dcache all along and let procfs ->kill_sb()
drop it before kill_anon_super().
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

021ada7d

A
mode_t, whack-a-mole at 11... · 0ecc833b
由 Al Viro 提交于 3月 29, 2013
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
0ecc833b

procfs: add proc_remove_subtree() · 8ce584c7

由 Al Viro 提交于 3月 30, 2013

just what it sounds like; do that only to procfs subtrees you've
created - doing that to something shared with another driver is
not only antisocial, but might cause interesting races with
proc_create() and its ilk.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

8ce584c7

23 3月, 2013 1 次提交

vfs,proc: guarantee unique inodes in /proc · 51f0885e

由 Linus Torvalds 提交于 3月 22, 2013

Dave Jones found another /proc issue with his Trinity tool: thanks to
the namespace model, we can have multiple /proc dentries that point to
the same inode, aliasing directories in /proc/<pid>/net/ for example.

This ends up being a total disaster, because it acts like hardlinked
directories, and causes locking problems. We rely on the topological
sort of the inodes pointed to by dentries, and if we have aliased
directories, that odering becomes unreliable.

In short: don't do this. Multiple dentries with the same (directory)
inode is just a bad idea, and the namespace code should never have
exposed things this way. But we're kind of stuck with it.

This solves things by just always allocating a new inode during /proc
dentry lookup, instead of using "iget_locked()" to look up existing
inodes by superblock and number. That actually simplies the code a bit,
at the cost of potentially doing more inode [de]allocations.

That said, the inode lookup wasn't free either (and did a lot of locking
of inodes), so it is probably not that noticeable. We could easily keep
the old lookup model for non-directory entries, but rather than try to
be excessively clever this just implements the minimal and simplest
workaround for the problem.
Reported-and-tested-by: NDave Jones <davej@redhat.com>
Analyzed-by: NAl Viro <viro@zeniv.linux.org.uk>
Cc: stable@vger.kernel.org
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

51f0885e

09 3月, 2013 1 次提交

proc: Use nd_jump_link in proc_ns_follow_link · db04dc67

由 Eric W. Biederman 提交于 3月 09, 2013

Update proc_ns_follow_link to use nd_jump_link instead of just
manually updating nd.path.dentry.

This fixes the BUG_ON(nd->inode != parent->d_inode) reported by Dave
Jones and reproduced trivially with mkdir /proc/self/ns/uts/a.

Sigh it looks like the VFS change to require use of nd_jump_link
happend while proc_ns_follow_link was baking and since the common case
of proc_ns_follow_link continued to work without problems the need for
making this change was overlooked.

Cc: stable@vger.kernel.org
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>

db04dc67

28 2月, 2013 3 次提交

fs/proc/vmcore.c: put if tests in the top of the while loop to reduce duplication · c2c1b089

由 Zhang Yanfei 提交于 2月 27, 2013

In read_vmcore() two `if' tests are duplicated.  Change the position of
them could reduce the duplication.  This change does not affect the
behaviour of the function.

[akpm@linux-foundation.org: avoid `if (foo = bar)' thing, use min_t()]
[akpm@linux-foundation.org: s/max_t/min_t/]
Signed-off-by: NZhang Yanfei <zhangyanfei@cn.fujitsu.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

c2c1b089

fs/proc: clean up printks · 87ebdc00

由 Andrew Morton 提交于 2月 27, 2013

- use pr_foo() throughout

- remove a couple of duplicated KERN_WARNINGs, via WARN(KERN_WARNING "...")

- nuke a few warnings which I've never seen happen, ever.

Cc: Joe Perches <joe@perches.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

87ebdc00

coredump: remove redundant defines for dumpable states · e579d2c2

由 Kees Cook 提交于 2月 27, 2013

The existing SUID_DUMP_* defines duplicate the newer SUID_DUMPABLE_*
defines introduced in 54b50199 ("coredump: warn about unsafe
suid_dumpable / core_pattern combo").  Remove the new ones, and use the
prior values instead.
Signed-off-by: NKees Cook <keescook@chromium.org>
Reported-by: NChen Gang <gang.chen@asianux.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alan Cox <alan@linux.intel.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Doug Ledford <dledford@redhat.com>
Cc: Serge Hallyn <serge.hallyn@canonical.com>
Cc: James Morris <james.l.morris@oracle.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

e579d2c2

26 2月, 2013 4 次提交

saner proc_get_inode() calling conventions · d3d009cb

由 Al Viro 提交于 1月 25, 2013

Make it drop the pde in *all* cases when no new reference to it is
put into an inode - both when an inode had already been set up
(as we were already doing) and when inode allocation has failed.
Makes for simpler logics in callers...
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

d3d009cb

proc: avoid extra pde_put() in proc_fill_super() · 87e0aab3

由 Maxim Patlasov 提交于 2月 20, 2013

If proc_get_inode() succeeded, but d_make_root() failed, pde_put() for
proc_root will be called twice: the first time due to iput() called from
d_make_root() and the second time directly in the end of
proc_fill_super().
Signed-off-by: NMaxim Patlasov <mpatlasov@parallels.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

87e0aab3

fs: change return values from -EACCES to -EPERM · 41735818

由 Zhao Hongjiang 提交于 2月 20, 2013

According to SUSv3:

[EACCES] Permission denied. An attempt was made to access a file in a way
forbidden by its file access permissions.

[EPERM] Operation not permitted. An attempt was made to perform an operation
limited to processes with appropriate privileges or to the owner of a file
or other resource.

So -EPERM should be returned if capability checks fails.

Strictly speaking this is an API change since the error code user sees is
altered.
Signed-off-by: NZhao Hongjiang <zhaohongjiang@huawei.com>
Acked-by: NJan Kara <jack@suse.cz>
Acked-by: NSteven Whitehouse <swhiteho@redhat.com>
Acked-by: NIan Kent <raven@themaw.net>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

41735818

d_hash_and_lookup(): export, switch open-coded instances · 4f522a24

由 Al Viro 提交于 2月 11, 2013

* calling conventions change - ERR_PTR() is returned on ->d_hash() errors;
NULL is just for dcache miss now.
* exported, open-coded instances in ncpfs and cifs converted.
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

4f522a24

24 2月, 2013 2 次提交

swap: make each swap partition have one address_space · 33806f06

由 Shaohua Li 提交于 2月 22, 2013

When I use several fast SSD to do swap, swapper_space.tree_lock is
heavily contended.  This makes each swap partition have one
address_space to reduce the lock contention.  There is an array of
address_space for swap.  The swap entry type is the index to the array.

In my test with 3 SSD, this increases the swapout throughput 20%.

[akpm@linux-foundation.org: revert unneeded change to  __add_to_swap_cache]
Signed-off-by: NShaohua Li <shli@fusionio.com>
Cc: Hugh Dickins <hughd@google.com>
Acked-by: NRik van Riel <riel@redhat.com>
Acked-by: NMinchan Kim <minchan@kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

33806f06

memory-failure: use num_poisoned_pages instead of mce_bad_pages · 293c07e3

由 Xishi Qiu 提交于 2月 22, 2013

Since MCE is an x86 concept, and this code is in mm/, it would be better
to use the name num_poisoned_pages instead of mce_bad_pages.

[akpm@linux-foundation.org: fix mm/sparse.c]
Signed-off-by: NXishi Qiu <qiuxishi@huawei.com>
Signed-off-by: NJiang Liu <jiang.liu@huawei.com>
Suggested-by: NBorislav Petkov <bp@alien8.de>
Reviewed-by: NWanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

293c07e3

23 2月, 2013 1 次提交
- A
  new helper: file_inode(file) · 496ad9aa
  由 Al Viro 提交于 1月 23, 2013
```
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>
```
  496ad9aa
19 2月, 2013 2 次提交

net: proc: remove proc_net_remove · c2399059

由 Gao feng 提交于 2月 18, 2013

proc_net_remove has been replaced by remove_proc_entry.
we can remove it now.
Signed-off-by: NGao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

c2399059

net: proc: remove proc_net_fops_create · b4278c96

由 Gao feng 提交于 2月 18, 2013

proc_net_fops_create has been replaced by proc_create,
we can remove it now.
Signed-off-by: NGao feng <gaofeng@cn.fujitsu.com>
Signed-off-by: NDavid S. Miller <davem@davemloft.net>

b4278c96

28 1月, 2013 1 次提交

cputime: Use accessors to read task cputime stats · 6fac4829

由 Frederic Weisbecker 提交于 11月 13, 2012

This is in preparation for the full dynticks feature. While
remotely reading the cputime of a task running in a full
dynticks CPU, we'll need to do some extra-computation. This
way we can account the time it spent tickless in userspace
since its last cputime snapshot.
Signed-off-by: NFrederic Weisbecker <fweisbec@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Li Zhong <zhong@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>

6fac4829

19 1月, 2013 1 次提交

tty: Added a CONFIG_TTY option to allow removal of TTY · 4f73bc4d

由 Joe Millenbach 提交于 1月 17, 2013

The option allows you to remove TTY and compile without errors. This
saves space on systems that won't support TTY interfaces anyway.
bloat-o-meter output is below.

The bulk of this patch consists of Kconfig changes adding "depends on
TTY" to various serial devices and similar drivers that require the TTY
layer.  Ideally, these dependencies would occur on a common intermediate
symbol such as SERIO, but most drivers "select SERIO" rather than
"depends on SERIO", and "select" does not respect dependencies.

bloat-o-meter output comparing our previous minimal to new minimal by
removing TTY.  The list is filtered to not show removed entries with awk
'$3 != "-"' as the list was very long.

add/remove: 0/226 grow/shrink: 2/14 up/down: 6/-35356 (-35350)
function                                     old     new   delta
chr_dev_init                                 166     170      +4
allow_signal                                  80      82      +2
static.__warned                              143     142      -1
disallow_signal                               63      62      -1
__set_special_pids                            95      94      -1
unregister_console                           126     121      -5
start_kernel                                 546     541      -5
register_console                             593     588      -5
copy_from_user                                45      40      -5
sys_setsid                                   128     120      -8
sys_vhangup                                   32      19     -13
do_exit                                     1543    1526     -17
bitmap_zero                                   60      40     -20
arch_local_irq_save                          137     117     -20
release_task                                 674     652     -22
static.spin_unlock_irqrestore                308     260     -48
Signed-off-by: NJoe Millenbach <jmillenbach@gmail.com>
Reviewed-by: NJamey Sharp <jamey@minilop.net>
Reviewed-by: NJosh Triplett <josh@joshtriplett.org>
Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>

4f73bc4d

03 1月, 2013 1 次提交

mempolicy: remove arg from mpol_parse_str, mpol_to_str · a7a88b23

由 Hugh Dickins 提交于 1月 02, 2013

Remove the unused argument (formerly no_context) from mpol_parse_str()
and from mpol_to_str().
Signed-off-by: NHugh Dickins <hughd@google.com>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

a7a88b23

26 12月, 2012 1 次提交

proc: Allow proc_free_inum to be called from any context · dfb2ea45

由 Eric W. Biederman 提交于 12月 21, 2012

While testing the pid namespace code I hit this nasty warning.

[  176.262617] ------------[ cut here ]------------
[  176.263388] WARNING: at /home/eric/projects/linux/linux-userns-devel/kernel/softirq.c:160 local_bh_enable_ip+0x7a/0xa0()
[  176.265145] Hardware name: Bochs
[  176.265677] Modules linked in:
[  176.266341] Pid: 742, comm: bash Not tainted 3.7.0userns+ #18
[  176.266564] Call Trace:
[  176.266564]  [<ffffffff810a539f>] warn_slowpath_common+0x7f/0xc0
[  176.266564]  [<ffffffff810a53fa>] warn_slowpath_null+0x1a/0x20
[  176.266564]  [<ffffffff810ad9ea>] local_bh_enable_ip+0x7a/0xa0
[  176.266564]  [<ffffffff819308c9>] _raw_spin_unlock_bh+0x19/0x20
[  176.266564]  [<ffffffff8123dbda>] proc_free_inum+0x3a/0x50
[  176.266564]  [<ffffffff8111d0dc>] free_pid_ns+0x1c/0x80
[  176.266564]  [<ffffffff8111d195>] put_pid_ns+0x35/0x50
[  176.266564]  [<ffffffff810c608a>] put_pid+0x4a/0x60
[  176.266564]  [<ffffffff8146b177>] tty_ioctl+0x717/0xc10
[  176.266564]  [<ffffffff810aa4d5>] ? wait_consider_task+0x855/0xb90
[  176.266564]  [<ffffffff81086bf9>] ? default_spin_lock_flags+0x9/0x10
[  176.266564]  [<ffffffff810cab0a>] ? remove_wait_queue+0x5a/0x70
[  176.266564]  [<ffffffff811e37e8>] do_vfs_ioctl+0x98/0x550
[  176.266564]  [<ffffffff810b8a0f>] ? recalc_sigpending+0x1f/0x60
[  176.266564]  [<ffffffff810b9127>] ? __set_task_blocked+0x37/0x80
[  176.266564]  [<ffffffff810ab95b>] ? sys_wait4+0xab/0xf0
[  176.266564]  [<ffffffff811e3d31>] sys_ioctl+0x91/0xb0
[  176.266564]  [<ffffffff810a95f0>] ? task_stopped_code+0x50/0x50
[  176.266564]  [<ffffffff81939199>] system_call_fastpath+0x16/0x1b
[  176.266564] ---[ end trace 387af88219ad6143 ]---

It turns out that spin_unlock_bh(proc_inum_lock) is not safe when
put_pid is called with another spinlock held and irqs disabled.

For now take the easy path and use spin_lock_irqsave(proc_inum_lock)
in proc_free_inum and spin_loc_irq in proc_alloc_inum(proc_inum_lock).
Signed-off-by: N"Eric W. Biederman" <ebiederm@xmission.com>

dfb2ea45

21 12月, 2012 2 次提交

proc: fix inconsistent lock state · ee297209

由 Xiaotian Feng 提交于 12月 20, 2012

Lockdep found an inconsistent lock state when rcu is processing delayed
work in softirq.  Currently, kernel is using spin_lock/spin_unlock to
protect proc_inum_ida, but proc_free_inum is called by rcu in softirq
context.

Use spin_lock_bh/spin_unlock_bh fix following lockdep warning.

  =================================
  [ INFO: inconsistent lock state ]
  3.7.0 #36 Not tainted
  ---------------------------------
  inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
  swapper/1/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
   (proc_inum_lock){+.?...}, at: proc_free_inum+0x1c/0x50
  {SOFTIRQ-ON-W} state was registered at:
     __lock_acquire+0x8ae/0xca0
     lock_acquire+0x199/0x200
     _raw_spin_lock+0x41/0x50
     proc_alloc_inum+0x4c/0xd0
     alloc_mnt_ns+0x49/0xc0
     create_mnt_ns+0x25/0x70
     mnt_init+0x161/0x1c7
     vfs_caches_init+0x107/0x11a
     start_kernel+0x348/0x38c
     x86_64_start_reservations+0x131/0x136
     x86_64_start_kernel+0x103/0x112
  irq event stamp: 2993422
  hardirqs last  enabled at (2993422):  _raw_spin_unlock_irqrestore+0x55/0x80
  hardirqs last disabled at (2993421):  _raw_spin_lock_irqsave+0x29/0x70
  softirqs last  enabled at (2993394):  _local_bh_enable+0x13/0x20
  softirqs last disabled at (2993395):  call_softirq+0x1c/0x30

  other info that might help us debug this:
   Possible unsafe locking scenario:

         CPU0
         ----
    lock(proc_inum_lock);
    <Interrupt>
      lock(proc_inum_lock);

   *** DEADLOCK ***

  no locks held by swapper/1/0.

  stack backtrace:
  Pid: 0, comm: swapper/1 Not tainted 3.7.0 #36
  Call Trace:
   <IRQ>  [<ffffffff810a40f1>] ? vprintk_emit+0x471/0x510
    print_usage_bug+0x2a5/0x2c0
    mark_lock+0x33b/0x5e0
    __lock_acquire+0x813/0xca0
    lock_acquire+0x199/0x200
    _raw_spin_lock+0x41/0x50
    proc_free_inum+0x1c/0x50
    free_pid_ns+0x1c/0x50
    put_pid_ns+0x2e/0x50
    put_pid+0x4a/0x60
    delayed_put_pid+0x12/0x20
    rcu_process_callbacks+0x462/0x790
    __do_softirq+0x1b4/0x3b0
    call_softirq+0x1c/0x30
    do_softirq+0x59/0xd0
    irq_exit+0x54/0xd0
    smp_apic_timer_interrupt+0x95/0xa3
    apic_timer_interrupt+0x72/0x80
    cpuidle_enter_tk+0x10/0x20
    cpuidle_enter_state+0x17/0x50
    cpuidle_idle_call+0x287/0x520
    cpu_idle+0xba/0x130
    start_secondary+0x2b3/0x2bc
Signed-off-by: NXiaotian Feng <dannyfeng@tencent.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

ee297209

procfs: drop vmtruncate · 46f69557

由 Marco Stornelli 提交于 12月 15, 2012

Removed vmtruncate
Signed-off-by: NMarco Stornelli <marco.stornelli@gmail.com>
Signed-off-by: NAl Viro <viro@zeniv.linux.org.uk>

46f69557

18 12月, 2012 7 次提交

fs, epoll: add procfs fdinfo helper · 138d22b5

由 Cyrill Gorcunov 提交于 12月 17, 2012

This allows us to print out eventpoll target file descriptor, events and
data, the /proc/pid/fdinfo/fd consists of

 | pos:	0
 | flags:	02
 | tfd:        5 events:       1d data: ffffffffffffffff enabled: 1

[avagin@: fix for unitialized ret variable]
Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
Acked-by: NPavel Emelyanov <xemul@parallels.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: James Bottomley <jbottomley@parallels.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matthew Helsley <matt.helsley@gmail.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

138d22b5

procfs: add ability to plug in auxiliary fdinfo providers · 55985dd7

由 Cyrill Gorcunov 提交于 12月 17, 2012

This patch brings ability to print out auxiliary data associated with
file in procfs interface /proc/pid/fdinfo/fd.

In particular further patches make eventfd, evenpoll, signalfd and
fsnotify to print additional information complete enough to restore
these objects after checkpoint.

To simplify the code we add show_fdinfo callback inside struct
file_operations (as Al and Pavel are proposing).
Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
Acked-by: NPavel Emelyanov <xemul@parallels.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrey Vagin <avagin@openvz.org>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: James Bottomley <jbottomley@parallels.com>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Matthew Helsley <matt.helsley@gmail.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>
Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@onelan.co.uk>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

55985dd7

proc: pid/status: show all supplementary groups · 8d238027

由 Artem Bityutskiy 提交于 12月 17, 2012

We display a list of supplementary group for each process in
/proc/<pid>/status.  However, we show only the first 32 groups, not all of
them.

Although this is rare, but sometimes processes do have more than 32
supplementary groups, and this kernel limitation breaks user-space apps
that rely on the group list in /proc/<pid>/status.

Number 32 comes from the internal NGROUPS_SMALL macro which defines the
length for the internal kernel "small" groups buffer.  There is no
apparent reason to limit to this value.

This patch removes the 32 groups printing limit.

The Linux kernel limits the amount of supplementary groups by NGROUPS_MAX,
which is currently set to 65536.  And this is the maximum count of groups
we may possibly print.
Signed-off-by: NArtem Bityutskiy <artem.bityutskiy@linux.intel.com>
Acked-by: NSerge E. Hallyn <serge.hallyn@ubuntu.com>
Acked-by: NKees Cook <keescook@chromium.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

8d238027

/proc/pid/status: add "Seccomp" field · 2f4b3bf6

由 Kees Cook 提交于 12月 17, 2012

It is currently impossible to examine the state of seccomp for a given
process.  While attaching with gdb and attempting "call
prctl(PR_GET_SECCOMP,...)" will work with some situations, it is not
reliable.  If the process is in seccomp mode 1, this query will kill the
process (prctl not allowed), if the process is in mode 2 with prctl not
allowed, it will similarly be killed, and in weird cases, if prctl is
filtered to return errno 0, it can look like seccomp is disabled.

When reviewing the state of running processes, there should be a way to
externally examine the seccomp mode.  ("Did this build of Chrome end up
using seccomp?" "Did my distro ship ssh with seccomp enabled?")

This adds the "Seccomp" line to /proc/$pid/status.
Signed-off-by: NKees Cook <keescook@chromium.org>
Reviewed-by: NCyrill Gorcunov <gorcunov@openvz.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: James Morris <jmorris@namei.org>
Acked-by: NSerge E. Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

2f4b3bf6

procfs: add VmFlags field in smaps output · 834f82e2

由 Cyrill Gorcunov 提交于 12月 17, 2012

During c/r sessions we've found that there is no way at the moment to
fetch some VMA associated flags, such as mlock() and madvise().

This leads us to a problem -- we don't know if we should call for mlock()
and/or madvise() after restore on the vma area we're bringing back to
life.

This patch intorduces a new field into "smaps" output called VmFlags,
where all set flags associated with the particular VMA is shown as two
letter mnemonics.

[ Strictly speaking for c/r we only need mlock/madvise bits but it has been
  said that providing just a few flags looks somehow inconsistent.  So all
  flags are here now. ]

This feature is made available on CONFIG_CHECKPOINT_RESTORE=n kernels, as
other applications may start to use these fields.

The data is encoded in a somewhat awkward two letters mnemonic form, to
encourage userspace to be prepared for fields being added or removed in
the future.

[a.p.zijlstra@chello.nl: props to use for_each_set_bit]
[sfr@canb.auug.org.au: props to use array instead of struct]
[akpm@linux-foundation.org: overall redesign and simplification]
[akpm@linux-foundation.org: remove unneeded braces per sfr, avoid using bloaty for_each_set_bit()]
Signed-off-by: NCyrill Gorcunov <gorcunov@openvz.org>
Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

834f82e2

proc: don't show nonexistent capabilities · 7b9a7ec5

由 Andrew Vagin 提交于 12月 17, 2012

Without this patch it is really hard to interpret a bounding set, if
CAP_LAST_CAP is unknown for a current kernel.

Non-existant capabilities can not be deleted from a bounding set with help
of prctl.

E.g.: Here are two examples without/with this patch.

  CapBnd:	ffffffe0fdecffff
  CapBnd:	00000000fdecffff

I suggest to hide non-existent capabilities. Here is two reasons.
* It's logically and easier for using.
* It helps to checkpoint-restore capabilities of tasks, because tasks
can be restored on another kernel, where CAP_LAST_CAP is bigger.
Signed-off-by: NAndrew Vagin <avagin@openvz.org>
Cc: Andrew G. Morgan <morgan@kernel.org>
Reviewed-by: NSerge E. Hallyn <serge.hallyn@canonical.com>
Cc: Pavel Emelyanov <xemul@parallels.com>
Reviewed-by: NKees Cook <keescook@chromium.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: James Morris <jmorris@namei.org>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

7b9a7ec5

procfs: use kbasename() · f9a00e87

由 Andy Shevchenko 提交于 12月 17, 2012

[yongjun_wei@trendmicro.com.cn: remove duplicated include]
Signed-off-by: NAndy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: NWei Yongjun <yongjun_wei@trendmicro.com.cn>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

f9a00e87

13 12月, 2012 1 次提交

procfs: use N_MEMORY instead N_HIGH_MEMORY · 4ff1b2c2

由 Lai Jiangshan 提交于 12月 12, 2012

N_HIGH_MEMORY stands for the nodes that has normal or high memory.
N_MEMORY stands for the nodes that has any memory.

The code here need to handle with the nodes which have memory, we should
use N_MEMORY instead.
Signed-off-by: NLai Jiangshan <laijs@cn.fujitsu.com>
Acked-by: NHillf Danton <dhillf@gmail.com>
Signed-off-by: NWen Congyang <wency@cn.fujitsu.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Lin Feng <linfeng@cn.fujitsu.com>
Cc: David Rientjes <rientjes@google.com>
Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>

4ff1b2c2

openanolis / cloud-kernel 1 年多 前同步成功

openanolis / cloud-kernel
1 年多前同步成功