1. 02 May, 2008 1 commit
  2. 30 Apr, 2008 4 commits
  3. 29 Apr, 2008 16 commits
    • P
      sysctl: add the ->permissions callback on the ctl_table_root · d7321cd6
      Pavel Emelyanov committed
      When reading from/writing to some table, a root, which this table came from,
      may affect this table's permissions, depending on who is working with the
      table.
      
      The core hunk is at the bottom of this patch.  All the rest is just pushing
      the ctl_table_root argument up to the sysctl_perm() function.
      
      This will be mostly (only?) used in the net sysctls.
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Acked-by: David S. Miller <davem@davemloft.net>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Alexey Dobriyan <adobriyan@sw.ru>
      Cc: Denis V. Lunev <den@openvz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      d7321cd6
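The idea of a root-level permissions hook can be sketched in user space as follows. This is a minimal model, not the kernel code: the struct layouts, the integer namespace ids, and the names `readonly_for_foreign_ns` and `effective_mode` are all assumptions made for illustration.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins; not the kernel's definitions. */
struct ctl_table {
    const char *procname;
    unsigned int mode;              /* permission bits, e.g. 0644 */
};

struct ctl_table_root {
    int owner_ns;                   /* assumed: id of the owning namespace */
    /* Optional hook: returns the mode the caller should see. */
    unsigned int (*permissions)(const struct ctl_table_root *root,
                                int caller_ns,
                                const struct ctl_table *table);
};

/* Example hook: callers from another namespace lose the write bits. */
unsigned int readonly_for_foreign_ns(const struct ctl_table_root *root,
                                     int caller_ns,
                                     const struct ctl_table *table)
{
    if (caller_ns == root->owner_ns)
        return table->mode;
    return table->mode & ~0222u;
}

/* Ask the root first; fall back to the table's own mode bits. */
unsigned int effective_mode(const struct ctl_table_root *root,
                            int caller_ns,
                            const struct ctl_table *table)
{
    if (root != NULL && root->permissions != NULL)
        return root->permissions(root, caller_ns, table);
    return table->mode;
}
```

With a hook like this, one table can be shared by all namespaces while each caller still sees its own permission bits, so no duplicated table with cleared write bits is needed.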
    • P
      sysctl: merge equal proc_sys_read and proc_sys_write · 7708bfb1
      Pavel Emelyanov committed
      Many (most of) sysctls do not have a per-container sense.  E.g.
      kernel.print_fatal_signals, vm.panic_on_oom, net.core.netdev_budget and so on
      and so forth.  Besides, tuning them from inside a container is not even
      secure.  On the other hand, hiding them completely from the container's tasks
      sometimes causes user-space to stop working.
      
      When developing net sysctl, the common practice was to duplicate a table and
      drop the write bits in table->mode, but this approach was not very elegant,
      led to excessive memory consumption and was not suitable in general.
      
      Here's the alternative solution.  To facilitate the per-container sysctls
      ctl_table_root-s were introduced.  Each root contains a list of
      ctl_table_header-s that are visible to different namespaces.  The idea of this
      set is to add the permissions() callback on the ctl_table_root so that a ctl
      root can limit permissions on the same ctl_table-s.
      
      The main user of this functionality is the net-namespaces code, but later this
      will (should) be used by more and more namespaces, containers and control
      groups.
      
      Actually, this idea's core is in a single hunk in the third patch.  First two
      patches are cleanups for sysctl code, while the third one mostly extends the
      arguments set of some sysctl functions.
      
      This patch:
      
      These ->read and ->write callbacks act in a very similar way, so merge these
      paths to reduce the number of places to patch later and shrink the .text size
      (a bit).
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Acked-by: "David S. Miller" <davem@davemloft.net>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Alexey Dobriyan <adobriyan@sw.ru>
      Cc: Denis V. Lunev <den@openvz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7708bfb1
    • D
      proc: introduce proc_create_data to setup de->data · 59b74351
      Denis V. Lunev committed
      This set of patches fixes ->open-less proc usage caused by the ->proc_fops flip
      in most of the kernel code.  The original OOPS is described in commit
      2d3a4e36:
      
          Typical PDE creation code looks like:
      
          	pde = create_proc_entry("foo", 0, NULL);
          	if (pde)
          		pde->proc_fops = &foo_proc_fops;
      
          Notice that PDE is first created, only then ->proc_fops is set up to
          final value. This is a problem because right after creation
          a) PDE is fully visible in /proc , and
          b) ->proc_fops are proc_file_operations which do not have ->open callback. So, it's
             possible to ->read without ->open (see one class of oopses below).
      
          The fix is new API called proc_create() which makes sure ->proc_fops are
          set up before gluing PDE to main tree. Typical new code looks like:
      
          	pde = proc_create("foo", 0, NULL, &foo_proc_fops);
          	if (!pde)
          		return -ENOMEM;
      
          Fix most networking users for a start.
      
          In the long run, create_proc_entry() for regular files will go.
      
      In addition to this, proc_create_data is introduced to fix reading from
      proc without PDE->data. The race is basically the same as above.
      
      create_proc_entry() calls are replaced throughout the kernel code, as the new
      method is also simply better.
      
      This patch:
      
      The problem is the same as for de->proc_fops.  Right now PDE becomes visible
      without data set.  So, the entry could be looked up without data.  This, in
      most cases, will simply OOPS.
      
      proc_create_data call is created to address this issue.  proc_create now
      becomes a wrapper around it.
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Cc: Dmitry Torokhov <dtor@mail.ru>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Grant Grundler <grundler@parisc-linux.org>
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Jaroslav Kysela <perex@suse.cz>
      Cc: Jeff Garzik <jgarzik@pobox.com>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Karsten Keil <kkeil@suse.de>
      Cc: Kyle McMartin <kyle@parisc-linux.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Nadia Derbey <Nadia.Derbey@bull.net>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Osterlund <petero2@telia.com>
      Cc: Pierre Peiffer <peifferp@gmail.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Takashi Iwai <tiwai@suse.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      59b74351
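The ordering rule behind proc_create_data can be modelled in a few lines of user-space C. This is a sketch under stated assumptions: `struct pde`, `struct file_ops`, `proc_tree`, and the `model_` functions are hypothetical simplifications, and the real kernel functions take additional arguments (mode, parent) that are omitted here.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct file_ops {
    int (*read)(void *data);
};

struct pde {
    const char *name;
    const struct file_ops *fops;
    void *data;
    struct pde *next;
};

/* Head of the list of entries that lookups can see. */
struct pde *proc_tree;

/*
 * Fill in every field first and link the entry into the visible list
 * last, so a lookup can never observe an entry without fops or data.
 */
struct pde *model_proc_create_data(const char *name,
                                   const struct file_ops *fops,
                                   void *data)
{
    struct pde *de = calloc(1, sizeof(*de));

    if (de == NULL)
        return NULL;
    de->name = name;
    de->fops = fops;
    de->data = data;
    de->next = proc_tree;   /* publish only after full initialization */
    proc_tree = de;
    return de;
}

/* The data-less variant becomes a thin wrapper, as the commit describes. */
struct pde *model_proc_create(const char *name, const struct file_ops *fops)
{
    return model_proc_create_data(name, fops, NULL);
}
```

The model is single-threaded; it only illustrates the order of initialization and publication, not the locking the kernel uses around the list.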
    • A
      proc: convert /proc/tty/ldiscs to seq_file interface · b640a89d
      Alexey Dobriyan committed
      Note: THIS_MODULE and header addition aren't technically needed because
            this code is not modular, but let's keep it anyway because people
            can copy this code into modular code.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      b640a89d
    • A
      proc: remove ->get_info infrastructure · 8731f14d
      Alexey Dobriyan committed
      Now that the last dozen or so users of ->get_info have been removed, ditch it
      too.  Everyone sane should have switched to the seq_file interface long ago.
      
      P.S.: Co-existing 3 interfaces (->get_info/->read_proc/->proc_fops) for proc
            is long-standing crap, BTW, thus
            a) put ->read_proc/->write_proc/read_proc_entry() users on death row,
            b) new such users should be rejected,
            c) everyone is encouraged to convert his favourite ->read_proc user or
               I'll do it, lazy bastards.
      Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      8731f14d
    • A
      proc: remove proc_root from drivers · c74c120a
      Alexey Dobriyan committed
      Remove proc_root export.  Creation and removal work well if the parent PDE is
      supplied as NULL -- it has always worked that way.

      So, one useless export is removed and consistency is added: some drivers
      created PDEs with &proc_root as parent but removed them with NULL, and so on.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      c74c120a
    • A
      proc: remove proc_root_driver · 928b4d8c
      Alexey Dobriyan committed
      Use creation by full path: "driver/foo".
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      928b4d8c
    • A
      proc: remove proc_root_fs · 36a5aeb8
      Alexey Dobriyan committed
      Use creation by full path instead: "fs/foo".
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      36a5aeb8
    • A
      proc: remove proc_bus · 9c37066d
      Alexey Dobriyan committed
      Remove proc_bus export and variable itself. Using pathnames works fine
      and is slightly more understandable and greppable.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      9c37066d
    • A
      proc: drop several "PDE valid/invalid" checks · 5e971dce
      Alexey Dobriyan committed
      proc-misc code is noticeably full of "if (de)" checks when PDE passed is
      always valid.  Remove them.
      
      The addition of such a check in proc_lookup_de() is for the failed lookup case.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      5e971dce
    • A
      proc: less special case in xlate code · 7cee4e00
      Alexey Dobriyan committed
      If a valid "parent" is passed to proc_create/remove_proc_entry(), then the name
      of the PDE should consist of only one path component, otherwise creation or
      removal will fail.  However, if NULL is passed as parent, then create/remove
      accept a full path as an argument.  This is an arbitrary restriction -- all
      infrastructure is in place.
      
      So, this patch allows the following to succeed:
      
      	create_proc_entry("foo/bar", 0, pde_baz);
      	remove_proc_entry("baz/foo/bar", &proc_root);
      
      It also makes the following behave identically:
      
      	create_proc_entry("foo/bar", 0, NULL);
      	create_proc_entry("foo/bar", 0, &proc_root);
      
      Discrepancy noticed by Den Lunev (IIRC).
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      7cee4e00
    • A
      proc: simplify locking in remove_proc_entry() · f649d6d3
      Alexey Dobriyan committed
      proc_subdir_lock protects only modifying and walking through PDE lists, so
      after we've found PDE to remove and actually removed it from lists, there is
      no need to hold proc_subdir_lock for the rest of operation.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      f649d6d3
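The locking pattern described in this commit, unlinking under the lock and finishing teardown outside it, can be sketched in user space with a pthread mutex. The list layout and the names `subdir_lock`, `add_entry`, and `remove_entry` are assumptions made for this illustration, not the kernel's implementation.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified directory entry. */
struct entry {
    char name[32];
    struct entry *next;
};

/* The lock protects only modifying and walking the list. */
static pthread_mutex_t subdir_lock = PTHREAD_MUTEX_INITIALIZER;
struct entry *subdir_head;

int add_entry(const char *name)
{
    struct entry *e = calloc(1, sizeof(*e));

    if (e == NULL)
        return -1;
    snprintf(e->name, sizeof(e->name), "%s", name);
    pthread_mutex_lock(&subdir_lock);
    e->next = subdir_head;
    subdir_head = e;
    pthread_mutex_unlock(&subdir_lock);
    return 0;
}

int remove_entry(const char *name)
{
    struct entry **p, *victim = NULL;

    /* Hold the lock only while searching and unlinking. */
    pthread_mutex_lock(&subdir_lock);
    for (p = &subdir_head; *p != NULL; p = &(*p)->next) {
        if (strcmp((*p)->name, name) == 0) {
            victim = *p;
            *p = victim->next;
            break;
        }
    }
    pthread_mutex_unlock(&subdir_lock);

    if (victim == NULL)
        return -1;
    /* The entry is no longer reachable, so teardown needs no list lock. */
    free(victim);
    return 0;
}
```

Once the entry is off the list, no other walker can find it, which is why the remaining work does not need the list lock in this model.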
    • R
      procfs: mem permission cleanup · 638fa202
      Roland McGrath committed
      This cleans up the permission checks done for /proc/PID/mem i/o calls.  It
      puts all the logic in a new function, check_mem_permission().
      
      The old code repeated the (!MAY_PTRACE(task) || !ptrace_may_attach(task))
      magical expression multiple times.  The new function does all that work in one
      place, with clear comments.
      
      The old code called security_ptrace() twice on successful checks, once in
      MAY_PTRACE() and once in __ptrace_may_attach().  Now it's only called once,
      and only if all other checks have succeeded.
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      638fa202
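A rough user-space model of consolidating the checks into one function, with the expensive security hook called once and last, might look like this. The task fields, the specific cheap checks, and every `model_` name are assumptions for illustration; the kernel's check_mem_permission() tests different state.

```c
#include <stdbool.h>

/* Hypothetical, simplified task descriptor. */
struct task {
    int id;
    bool traced_by_caller;  /* assumed: caller is the ptrace parent */
    bool stopped;           /* assumed: target is currently stopped */
};

/* Counts hook invocations so the "called once" property is visible. */
int security_hook_calls;

/* Stand-in for the security module hook; always allows in this model. */
bool model_security_hook(const struct task *target)
{
    (void)target;
    security_hook_calls++;
    return true;
}

/*
 * One place for all the logic: cheap checks first, and the security
 * hook only after every other check has passed.
 */
bool model_check_mem_permission(const struct task *caller,
                                const struct task *target)
{
    if (caller == target)
        return true;            /* a task may access its own memory */
    if (!target->traced_by_caller)
        return false;
    if (!target->stopped)
        return false;
    return model_security_hook(target);
}
```

Callers then use the single function instead of repeating a compound expression, and a denied request never reaches the hook.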
    • A
      0d5c9f5f
    • M
      procfs task exe symlink · 925d1c40
      Matt Helsley committed
      The kernel implements readlink of /proc/pid/exe by getting the file from
      the first executable VMA.  Then the path to the file is reconstructed and
      reported as the result.
      
      Because of the VMA walk the code is slightly different on nommu systems.
      This patch avoids separate /proc/pid/exe code on nommu systems.  Instead of
      walking the VMAs to find the first executable file-backed VMA we store a
      reference to the exec'd file in the mm_struct.
      
      That reference would prevent the filesystem holding the executable file
      from being unmounted even after unmapping the VMAs.  So we track the number
      of VM_EXECUTABLE VMAs and drop the new reference when the last one is
      unmapped.  This avoids pinning the mounted filesystem.
      
      [akpm@linux-foundation.org: improve comments]
      [yamamoto@valinux.co.jp: fix dup_mmap]
      Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: David Howells <dhowells@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      925d1c40
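The reference-tracking scheme in this commit can be illustrated with a small user-space model: the mm holds one reference to the executable and gives it up when the count of executable mappings reaches zero. The struct fields and `model_` function names are hypothetical simplifications, not the kernel's definitions.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for struct file and mm_struct. */
struct file {
    int refcount;
};

struct mm {
    struct file *exe_file;       /* reference to the exec'd file */
    unsigned long num_exe_vmas;  /* live executable mappings of it */
};

/* Store a new executable reference, releasing any previous one. */
void model_set_exe_file(struct mm *mm, struct file *f)
{
    if (f != NULL)
        f->refcount++;
    if (mm->exe_file != NULL)
        mm->exe_file->refcount--;
    mm->exe_file = f;
    mm->num_exe_vmas = 0;
}

void model_added_exe_vma(struct mm *mm)
{
    mm->num_exe_vmas++;
}

/* Drop the reference when the last executable mapping goes away. */
void model_removed_exe_vma(struct mm *mm)
{
    if (mm->num_exe_vmas == 0)
        return;
    if (--mm->num_exe_vmas == 0 && mm->exe_file != NULL) {
        mm->exe_file->refcount--;
        mm->exe_file = NULL;
    }
}
```

In this model the file, and therefore its filesystem, stays pinned only while at least one executable mapping exists.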
    • A
      proc: print more information when removing non-empty directories · e93b4ea2
      Alexey Dobriyan committed
      This usually saves one recompile to insert a printk like the one below. :)
      
      Sample nastygram:
      
      remove_proc_entry: removing non-empty directory '/proc/foo', leaking at least 'bar'
      ------------[ cut here ]------------
      WARNING: at fs/proc/generic.c:776 remove_proc_entry+0x18a/0x200()
      Modules linked in: foo(-) container fan battery dock sbs ac sbshc backlight ipv6 loop af_packet amd_rng sr_mod i2c_amd8111 i2c_amd756 cdrom i2c_core button thermal processor
      Pid: 3034, comm: rmmod Tainted: G   M     2.6.25-rc1 #5
      
      Call Trace:
       [<ffffffff80231974>] warn_on_slowpath+0x64/0x90
       [<ffffffff80232a6e>] printk+0x4e/0x60
       [<ffffffff802d6c8a>] remove_proc_entry+0x18a/0x200
       [<ffffffff8045cd88>] mutex_lock_nested+0x1c8/0x2d0
       [<ffffffff8025f0f0>] __try_stop_module+0x0/0x40
       [<ffffffff8025effd>] sys_delete_module+0x14d/0x200
       [<ffffffff8045df3d>] lockdep_sys_exit_thunk+0x35/0x67
       [<ffffffff8031c307>] __up_read+0x27/0xa0
       [<ffffffff8045decc>] trace_hardirqs_on_thunk+0x35/0x3a
       [<ffffffff8020b6ab>] system_call_after_swapgs+0x7b/0x80
      
      ---[ end trace 10ef850597e89c54 ]---
      Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      e93b4ea2
  4. 28 Apr, 2008 3 commits
  5. 23 Apr, 2008 3 commits
    • R
      [patch 6/7] vfs: mountinfo: add /proc/<pid>/mountinfo · 2d4d4864
      Ram Pai committed
      [mszeredi@suse.cz] rewrite and split big patch into manageable chunks
      
      /proc/mounts in its current form lacks important information:
      
       - propagation state
       - root of mount for bind mounts
       - the st_dev value used within the filesystem
       - identifier for each mount and its parent
      
      It also suffers from the following problems:
      
       - not easily extendable
       - ambiguity of mountpoints within a chrooted environment
       - doesn't distinguish between filesystem dependent and independent options
       - doesn't distinguish between per mount and per super block options
      
      This patch introduces /proc/<pid>/mountinfo which attempts to address
      all these deficiencies.
      
      Code shared between /proc/<pid>/mounts and /proc/<pid>/mountinfo is
      extracted into separate functions.
      
      Thanks to Al Viro for the help in getting the design right.
      Signed-off-by: Ram Pai <linuxram@us.ibm.com>
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      2d4d4864
    • M
      [patch 5/7] vfs: mountinfo: allow using process root · a1a2c409
      Miklos Szeredi committed
      Allow /proc/<pid>/mountinfo to use the root of <pid> to calculate
      mountpoints.
      
       - move definition of 'struct proc_mounts' to <linux/mnt_namespace.h>
       - add the process's namespace and root to this structure
       - pass a pointer to 'struct proc_mounts' into seq_operations
      
      In addition the following cleanups are made:
      
       - use a common open function for /proc/<pid>/{mounts,mountstat}
       - surround namespace.c part of these proc files with #ifdef CONFIG_PROC_FS
       - make the seq_operations structures const
      Signed-off-by: Miklos Szeredi <mszeredi@suse.cz>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      a1a2c409
    • A
      [PATCH] proc_readfd_common() race fix · 9b4f526c
      Al Viro committed
      Since we drop the rcu_read_lock inside the loop, we can't assume
      that files->fdt will remain unchanged (and not freed) between
      iterations.
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      9b4f526c
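The rule stated in this fix, not carrying a table pointer across a point where the lock was dropped, can be modelled in single-threaded user-space C. The hook simulates a concurrent table replacement; the structs and all `model_` names are assumptions for illustration only, and no real RCU or locking is involved.

```c
#include <stddef.h>

/* Hypothetical, simplified descriptor table. */
struct fdtable {
    unsigned int max_fds;
    const unsigned char *open;  /* open[fd] != 0 means fd is in use */
};

struct files {
    struct fdtable *fdt;        /* may be replaced while unlocked */
};

/* Runs where the real code would have dropped the read lock. */
typedef void (*unlocked_hook)(struct files *files, unsigned int fd);

unsigned int model_count_open_fds(struct files *files, unlocked_hook hook)
{
    unsigned int fd, seen = 0;

    for (fd = 0; ; fd++) {
        /* "Lock held": reload the table pointer on every pass. */
        const struct fdtable *fdt = files->fdt;
        int is_open;

        if (fd >= fdt->max_fds)
            break;
        is_open = fdt->open[fd] != 0;
        /* "Lock dropped": fdt is not used again in this iteration. */
        if (is_open)
            seen++;
        if (hook != NULL)
            hook(files, fd);
    }
    return seen;
}

/* Simulated concurrent resize: swap in a larger table after fd 0. */
static const unsigned char big_bits[4] = { 1, 1, 0, 1 };
static struct fdtable big_table = { 4, big_bits };

void model_grow_table(struct files *files, unsigned int fd)
{
    if (fd == 0)
        files->fdt = &big_table;
}
```

Because the pointer is reloaded at the top of each pass, the loop follows the replacement table instead of reading through a stale one.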
  6. 02 Apr, 2008 1 commit
  7. 28 Mar, 2008 1 commit
  8. 26 Mar, 2008 1 commit
  9. 23 Mar, 2008 1 commit
  10. 21 Mar, 2008 1 commit
    • A
      [NET]: Fix permissions of /proc/net · 4f42c288
      Andre Noll committed
      commit e9720acd ([NET]: Make /proc/net a symlink on /proc/self/net (v3))
      broke ganglia and probably other applications that read /proc/net/dev.
      
      This is due to the change of permissions of /proc/net that was
      introduced in that commit.
      
      Before: dr-xr-xr-x 5 root root 0 Mar 19 11:30 /proc/net
      After: dr-xr--r-- 5 root root 0 Mar 19 11:29 /proc/self/net
      
      This patch restores the permissions to the old value which makes
      ganglia happy again.
      
      Pavel Emelyanov says:
      
      	This also broke the postfix, as it was reported in bug #10286
      	and described in detail by Benjamin.
      Signed-off-by: Andre Noll <maan@systemlinux.org>
      Acked-by: Pavel Emelyanov <xemul@openvz.org>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      4f42c288
  11. 18 Mar, 2008 1 commit
  12. 14 Mar, 2008 1 commit
  13. 12 Mar, 2008 1 commit
  14. 08 Mar, 2008 1 commit
    • P
      [NET]: Make /proc/net a symlink on /proc/self/net (v3) · e9720acd
      Pavel Emelyanov committed
      The current /proc/net is implemented with so-called "shadows", but the current
      implementation is broken and has little chance of being fixed.
      
      The problem is that dentries subtree of /proc/net directory has
      fancy revalidation rules to make processes living in different
      net namespaces see different entries in /proc/net subtree, but
      currently, tasks see in the /proc/net subdir the contents of any
      other namespace, depending on who opened the file first.
      
      The proposed fix is to turn /proc/net into a symlink, which points
      to /proc/self/net, which in turn shows what previously was in
      /proc/net - the network-related info, from the net namespace the
      appropriate task lives in.
      
      # ls -l /proc/net
      lrwxrwxrwx  1 root root 8 Mar  5 15:17 /proc/net -> self/net
      
      In other words - this behaves like /proc/mounts, but unlike
      "mounts", "net" is not a file, but a directory.
      
      Changes from v2:
      * Fixed discrepancy of /proc/net nlink count and selinux labeling
        screwup pointed out by Stephen.
      
        To get the correct nlink count the ->getattr callback for /proc/net
        is overridden to read one from the net->proc_net entry.
      
        To make selinux still work the net->proc_net entry is initialized
        properly, i.e. with the "net" name and the proc_net parent.
      
      Selinux fixes are
      Acked-by: Stephen Smalley <sds@tycho.nsa.gov>
      
      Changes from v1:
      * Fixed a task_struct leak in get_proc_task_net, pointed out by Paul.
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
      e9720acd
  15. 04 Mar, 2008 1 commit
  16. 25 Feb, 2008 3 commits