1. 14 Jul, 2008 1 commit
    • S
      Security: split proc ptrace checking into read vs. attach · 006ebb40
      Stephen Smalley committed
      Enable security modules to distinguish reading of process state via
      proc from full ptrace access by renaming ptrace_may_attach to
      ptrace_may_access and adding a mode argument indicating whether only
      read access or full attach access is requested.  This allows security
      modules to permit access to reading process state without granting
      full ptrace access.  The base DAC/capability checking remains unchanged.
      
      Read access to /proc/pid/mem continues to apply a full ptrace attach
      check since check_mem_permission() already requires the current task
      to be ptracing the target.  The other ptrace checks within
      proc for elements like environ, maps, and fds are changed to pass the
      read mode instead of attach.
      
      In the SELinux case, we model such reading of process state as a
      reading of a proc file labeled with the target process' label.  This
      enables SELinux policy to permit such reading of process state without
      permitting control or manipulation of the target process, as there are
      a number of cases where programs probe for such information via proc
      but do not need to be able to control the target (e.g. procps,
      lsof, PolicyKit, ConsoleKit).  At present we have to choose between
      allowing full ptrace in policy (more permissive than required/desired)
      or breaking functionality (or in some cases just silencing the denials
      via dontaudit rules but this can hide genuine attacks).
      
      This version of the patch incorporates comments from Casey Schaufler
      (change/replace existing ptrace_may_attach interface, pass access
      mode), and Chris Wright (provide greater consistency in the checking).
      
      Note that like their predecessors __ptrace_may_attach and
      ptrace_may_attach, the __ptrace_may_access and ptrace_may_access
      interfaces use different return value conventions from each other (0
      or -errno vs. 1 or 0).  I retained this difference to avoid any
      changes to the caller logic but made the difference clearer by
      changing the latter interface to return a bool rather than an int and
      by adding a comment about it to ptrace.h for any future callers.
      Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov>
      Acked-by: Chris Wright <chrisw@sous-sol.org>
      Signed-off-by: James Morris <jmorris@namei.org>
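      The two return conventions described above can be sketched in user space
      as follows.  This is only a toy model under stated assumptions: the
      struct, the mode names and the function names are hypothetical and are
      not the kernel's definitions.  It shows a low-level check returning 0 or
      -errno, and a wrapper with the opposite polarity returning a bool.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical access modes; names and values are illustrative only. */
enum access_mode { MODE_READ = 1, MODE_ATTACH = 2 };

/* Toy stand-in for a task: only the fields the check needs. */
struct toy_task {
    int uid;
    bool lsm_allows_read;    /* security module permits reading state */
    bool lsm_allows_attach;  /* security module permits full attach */
};

/* Low-level check: returns 0 on success or a negative errno. */
int toy_may_access_errno(const struct toy_task *cur,
                         const struct toy_task *target,
                         enum access_mode mode)
{
    /* The base DAC-style check is identical for both modes. */
    if (cur->uid != 0 && cur->uid != target->uid)
        return -EPERM;
    /* Only the security-module decision depends on the mode. */
    if (mode == MODE_READ)
        return target->lsm_allows_read ? 0 : -EACCES;
    return target->lsm_allows_attach ? 0 : -EACCES;
}

/* Wrapper: returns true when access is allowed. */
bool toy_may_access(const struct toy_task *cur,
                    const struct toy_task *target,
                    enum access_mode mode)
{
    return toy_may_access_errno(cur, target, mode) == 0;
}
```

      A caller that only needs to read state would pass MODE_READ, so a
      policy can grant that without granting MODE_ATTACH.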
  2. 06 Jul, 2008 2 commits
  3. 13 Jun, 2008 2 commits
  4. 07 Jun, 2008 4 commits
  5. 01 Jun, 2008 1 commit
    • A
      capabilities: remain source compatible with 32-bit raw legacy capability support. · ca05a99a
      Andrew G. Morgan committed
      Source code out there hard-codes a notion of what the
      _LINUX_CAPABILITY_VERSION #define means in terms of the semantics of the
      raw capability system calls capget() and capset().  It's unfortunate, but
      true.
      
      Since the confusing header file has been in a released kernel, there is
      software that is erroneously using 64-bit capabilities with the semantics
      of 32-bit capabilities.  These recently compiled programs may suffer
      corruption of their memory when sys_capget() overwrites more memory than
      they are coded to expect, and the raising of added capabilities when using
      sys_capset().
      
      As such, this patch does a number of things to clean up the situation
      for all. It
      
        1. forces the _LINUX_CAPABILITY_VERSION define to always retain its
           legacy value.
      
        2. adopts a new #define strategy for the kernel's internal
           implementation of the preferred magic.
      
        3. deprecates v2 capability magic in favor of a new (v3) magic
           number. The functionality of v3 is entirely equivalent to v2,
           the only difference being that the v2 magic causes the kernel
           to log a "deprecated" warning so the admin can find applications
           that may be using v2 inappropriately.
      
      [User space code continues to be encouraged to use the libcap API which
      protects the application from details like this.  libcap-2.10 is the first
      to support v3 capabilities.]
      
      Fixes issue reported in https://bugzilla.redhat.com/show_bug.cgi?id=447518.
      Thanks to Bojan Smojver for the report.
      
      [akpm@linux-foundation.org: s/depreciate/deprecate/g]
      [akpm@linux-foundation.org: be robust about put_user size]
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: Andrew G. Morgan <morgan@kernel.org>
      Cc: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Bojan Smojver <bojan@rexursive.com>
      Cc: stable@kernel.org
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Chris Wright <chrisw@sous-sol.org>
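      The version handling listed in the commit message can be sketched as a
      small dispatch on the magic number.  This is a user-space illustration,
      not the kernel code: the macro and function names are hypothetical, while
      the three magic values are the ones found in <linux/capability.h>.  The
      v2 and v3 cases return the same layout and differ only in the warning.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Magic values as found in <linux/capability.h>; short names are ours. */
#define CAP_VERSION_1 0x19980330u  /* legacy: one 32-bit word per set */
#define CAP_VERSION_2 0x20071026u  /* deprecated: two words per set */
#define CAP_VERSION_3 0x20080522u  /* preferred: two words per set */

/*
 * Returns the number of 32-bit words per capability set for the given
 * magic, or 0 if the magic is unknown.  Sets *warned to 1 when the
 * deprecated v2 magic is seen.
 */
unsigned cap_words_for_version(uint32_t magic, int *warned)
{
    switch (magic) {
    case CAP_VERSION_1:
        return 1;
    case CAP_VERSION_2:
        /* Same layout as v3; the only difference is the warning. */
        fprintf(stderr, "warning: deprecated v2 capability magic in use\n");
        *warned = 1;
        return 2;
    case CAP_VERSION_3:
        return 2;
    default:
        return 0;
    }
}
```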
  6. 25 May, 2008 2 commits
  7. 17 May, 2008 1 commit
  8. 13 May, 2008 1 commit
  9. 09 May, 2008 1 commit
  10. 05 May, 2008 1 commit
  11. 02 May, 2008 2 commits
  12. 30 Apr, 2008 4 commits
  13. 29 Apr, 2008 16 commits
    • P
      sysctl: add the ->permissions callback on the ctl_table_root · d7321cd6
      Pavel Emelyanov committed
      When a table is read or written, the root that the table came from
      may affect the table's permissions, depending on who is accessing the
      table.
      
      The core hunk is at the bottom of this patch.  All the rest is just pushing
      the ctl_table_root argument up to the sysctl_perm() function.
      
      This will be mostly (only?) used in the net sysctls.
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Acked-by: David S. Miller <davem@davemloft.net>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Alexey Dobriyan <adobriyan@sw.ru>
      Cc: Denis V. Lunev <den@openvz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
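      The idea of a per-root permissions hook can be sketched as below.  All
      types and names here are hypothetical stand-ins rather than the kernel's
      structures, and the mode check is deliberately simplified to the owner
      bits.  The point is only that the same table can yield different
      effective permissions depending on which root it is reached through.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAY_WRITE 2u
#define TOY_MAY_READ  4u

struct toy_table {
    const char *name;
    unsigned mode;   /* simplified: only the owner bits are consulted */
};

struct toy_root {
    /* Optional hook: returns the mode this root grants for the table. */
    unsigned (*permissions)(const struct toy_root *root,
                            const struct toy_table *table);
};

/* Example hook for a restricted root: strip every write bit. */
unsigned toy_readonly_permissions(const struct toy_root *root,
                                  const struct toy_table *table)
{
    (void)root;
    return table->mode & ~0222u;
}

/* Returns 0 if op is allowed on table as seen through root, else -EACCES. */
int toy_sysctl_perm(const struct toy_root *root,
                    const struct toy_table *table, unsigned op)
{
    unsigned mode = root->permissions ? root->permissions(root, table)
                                      : table->mode;
    unsigned owner_bits = (mode >> 6) & 7u;

    return (owner_bits & op) == op ? 0 : -EACCES;
}
```

      With this shape there is no need to duplicate a table just to clear its
      write bits for one class of callers.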
    • P
      sysctl: merge equal proc_sys_read and proc_sys_write · 7708bfb1
      Pavel Emelyanov committed
      Many (most) sysctls do not make sense on a per-container basis, e.g.
      kernel.print_fatal_signals, vm.panic_on_oom, net.core.netdev_budget and so
      on.  Besides, tuning them from inside a container is not even
      secure.  On the other hand, hiding them completely from the container's tasks
      sometimes causes user-space to stop working.
      
      When developing net sysctl, the common practice was to duplicate a table and
      drop the write bits in table->mode, but this approach was not very elegant,
      led to excessive memory consumption and was not suitable in general.
      
      Here's the alternative solution.  To facilitate the per-container sysctls
      ctl_table_root-s were introduced.  Each root contains a list of
      ctl_table_header-s that are visible to different namespaces.  The idea of this
      set is to add the permissions() callback on the ctl_table_root to allow a ctl
      root to limit permissions on the same ctl_table-s.
      
      The main user of this functionality is the net-namespaces code, but later this
      will (should) be used by more and more namespaces, containers and control
      groups.
      
      Actually, this idea's core is in a single hunk in the third patch.  The first
      two patches are cleanups for sysctl code, while the third one mostly extends
      the argument list of some sysctl functions.
      
      This patch:
      
      These ->read and ->write callbacks act in a very similar way, so merge these
      paths to reduce the number of places to patch later and shrink the .text size
      (a bit).
      Signed-off-by: Pavel Emelyanov <xemul@openvz.org>
      Acked-by: "David S. Miller" <davem@davemloft.net>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Alexey Dobriyan <adobriyan@sw.ru>
      Cc: Denis V. Lunev <den@openvz.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • D
      proc: introduce proc_create_data to setup de->data · 59b74351
      Denis V. Lunev committed
      This set of patches fixes ->open'less proc usage caused by the ->proc_fops
      flip in most of the kernel code.  The original OOPS is described in
      commit 2d3a4e36:
      
          Typical PDE creation code looks like:
      
          	pde = create_proc_entry("foo", 0, NULL);
          	if (pde)
          		pde->proc_fops = &foo_proc_fops;
      
          Notice that PDE is first created, only then ->proc_fops is set up to
          final value. This is a problem because right after creation
          a) PDE is fully visible in /proc , and
          b) ->proc_fops are proc_file_operations which do not have ->open callback. So, it's
             possible to ->read without ->open (see one class of oopses below).
      
          The fix is new API called proc_create() which makes sure ->proc_fops are
          set up before gluing PDE to main tree. Typical new code looks like:
      
          	pde = proc_create("foo", 0, NULL, &foo_proc_fops);
          	if (!pde)
          		return -ENOMEM;
      
          Fix most networking users for a start.
      
          In the long run, create_proc_entry() for regular files will go.
      
      In addition to this, proc_create_data is introduced to fix reading from
      proc without PDE->data. The race is basically the same as above.
      
      create_proc_entry() calls are replaced throughout the kernel code, as the
      new method is also simply better.
      
      This patch:
      
      The problem is the same as for de->proc_fops.  Right now the PDE becomes
      visible without its data set, so the entry can be looked up without data.
      In most cases this will simply OOPS.
      
      The proc_create_data() call is created to address this issue.  proc_create()
      now becomes a wrapper around it.
      Signed-off-by: Denis V. Lunev <den@openvz.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Alessandro Zummo <a.zummo@towertech.it>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
      Cc: Chris Mason <chris.mason@oracle.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Cc: Dmitry Torokhov <dtor@mail.ru>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Grant Grundler <grundler@parisc-linux.org>
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Cc: Haavard Skinnemoen <hskinnemoen@atmel.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
      Cc: Jaroslav Kysela <perex@suse.cz>
      Cc: Jeff Garzik <jgarzik@pobox.com>
      Cc: Jeff Mahoney <jeffm@suse.com>
      Cc: Jesper Nilsson <jesper.nilsson@axis.com>
      Cc: Karsten Keil <kkeil@suse.de>
      Cc: Kyle McMartin <kyle@parisc-linux.org>
      Cc: Len Brown <lenb@kernel.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Cc: Matthew Wilcox <matthew@wil.cx>
      Cc: Mauro Carvalho Chehab <mchehab@infradead.org>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Nadia Derbey <Nadia.Derbey@bull.net>
      Cc: Neil Brown <neilb@suse.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Peter Osterlund <petero2@telia.com>
      Cc: Pierre Peiffer <peifferp@gmail.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Takashi Iwai <tiwai@suse.de>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Trond Myklebust <trond.myklebust@fys.uio.no>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
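      The ordering problem and the wrapper relationship described above can be
      modelled in user space roughly as follows.  The struct and function names
      are hypothetical and there is no real locking or procfs here.  The sketch
      only shows the entry being fully initialised (fops and data) before it is
      linked into the visible list, with the plain create call delegating to
      the data-carrying one.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for a proc directory entry. */
struct toy_entry {
    const char *name;
    const void *fops;       /* operations table, opaque in this sketch */
    void *data;             /* per-entry private data */
    struct toy_entry *next;
};

/* Head of the list that lookups would walk; an entry here is "visible". */
struct toy_entry *toy_list;

/* Create an entry with fops and data set before it becomes visible. */
struct toy_entry *toy_create_data(const char *name, const void *fops,
                                  void *data)
{
    struct toy_entry *e = calloc(1, sizeof(*e));

    if (!e)
        return NULL;
    e->name = name;
    e->fops = fops;
    e->data = data;
    /* Publish last, so no lookup can observe a half-initialised entry. */
    e->next = toy_list;
    toy_list = e;
    return e;
}

/* The data-less variant is just a wrapper passing NULL data. */
struct toy_entry *toy_create(const char *name, const void *fops)
{
    return toy_create_data(name, fops, NULL);
}
```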
    • A
      proc: convert /proc/tty/ldiscs to seq_file interface · b640a89d
      Alexey Dobriyan committed
      Note: THIS_MODULE and header addition aren't technically needed because
            this code is not modular, but let's keep it anyway because people
            can copy this code into modular code.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • A
      proc: remove ->get_info infrastructure · 8731f14d
      Alexey Dobriyan committed
      Now that the last dozen or so users of ->get_info have been removed, ditch it too.
      Everyone sane should have switched to the seq_file interface long ago.
      
      P.S.: Co-existing 3 interfaces (->get_info/->read_proc/->proc_fops) for proc
            is long-standing crap, BTW, thus
            a) put ->read_proc/->write_proc/read_proc_entry() users on death row,
            b) new such users should be rejected,
            c) everyone is encouraged to convert his favourite ->read_proc user or
               I'll do it, lazy bastards.
      Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • A
      proc: remove proc_root from drivers · c74c120a
      Alexey Dobriyan committed
      Remove proc_root export.  Creation and removal work well if the parent PDE is
      supplied as NULL -- it has always worked that way.
      
      So, one useless export is removed and consistency is added: some drivers
      created PDEs with &proc_root as parent but removed them with NULL, and so on.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • A
      proc: remove proc_root_driver · 928b4d8c
      Alexey Dobriyan committed
      Use creation by full path: "driver/foo".
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • A
      proc: remove proc_root_fs · 36a5aeb8
      Alexey Dobriyan committed
      Use creation by full path instead: "fs/foo".
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • A
      proc: remove proc_bus · 9c37066d
      Alexey Dobriyan committed
      Remove the proc_bus export and the variable itself. Using pathnames works fine
      and is slightly more understandable and greppable.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • A
      proc: drop several "PDE valid/invalid" checks · 5e971dce
      Alexey Dobriyan committed
      proc-misc code is noticeably full of "if (de)" checks even though the PDE
      passed is always valid.  Remove them.
      
      The addition of such a check in proc_lookup_de() is for the failed lookup case.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • A
      proc: less special case in xlate code · 7cee4e00
      Alexey Dobriyan committed
      If a valid "parent" is passed to proc_create/remove_proc_entry(), then the name
      of the PDE must consist of only one path component, otherwise creation or
      removal will fail.  However, if NULL is passed as parent, then create/remove
      accept a full path as an argument.  This is an arbitrary restriction -- all
      the infrastructure is in place.
      
      So, this patch allows the following to succeed:
      
      	create_proc_entry("foo/bar", 0, pde_baz);
      	remove_proc_entry("baz/foo/bar", &proc_root);
      
      It also makes the following behave identically:
      
      	create_proc_entry("foo/bar", 0, NULL);
      	create_proc_entry("foo/bar", 0, &proc_root);
      
      Discrepancy noticed by Den Lunev (IIRC).
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
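      The path translation described in this commit message can be illustrated
      with a toy directory tree.  The names below (toy_dir, toy_xlate and so
      on) are hypothetical and this is not the kernel's xlate code.  The sketch
      walks every path component except the last, starting from the supplied
      parent, and treats a NULL parent the same as the root.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct toy_dir {
    const char *name;
    struct toy_dir *subdir;  /* first child */
    struct toy_dir *next;    /* next sibling */
};

struct toy_dir toy_root = { "/proc", NULL, NULL };

/* Find a direct child whose name equals the first len bytes of name. */
struct toy_dir *toy_child(struct toy_dir *dir, const char *name, size_t len)
{
    for (struct toy_dir *d = dir->subdir; d; d = d->next)
        if (strlen(d->name) == len && memcmp(d->name, name, len) == 0)
            return d;
    return NULL;
}

/*
 * Walk every component of path except the last, starting at parent
 * (or at the root when parent is NULL).  On success, store the final
 * component in *last and return the directory that should hold it.
 */
struct toy_dir *toy_xlate(struct toy_dir *parent, const char *path,
                          const char **last)
{
    struct toy_dir *dir = parent ? parent : &toy_root;
    const char *slash;

    while ((slash = strchr(path, '/')) != NULL) {
        dir = toy_child(dir, path, (size_t)(slash - path));
        if (!dir)
            return NULL;  /* an intermediate directory is missing */
        path = slash + 1;
    }
    *last = path;
    return dir;
}
```

      Because the same walk runs for every parent, a multi-component name
      resolves relative to any directory, not only when the parent is NULL.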
    • A
      proc: simplify locking in remove_proc_entry() · f649d6d3
      Alexey Dobriyan committed
      proc_subdir_lock protects only modifying and walking through PDE lists, so
      after we've found the PDE to remove and actually removed it from the lists,
      there is no need to hold proc_subdir_lock for the rest of the operation.
      Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • R
      procfs: mem permission cleanup · 638fa202
      Roland McGrath committed
      This cleans up the permission checks done for /proc/PID/mem i/o calls.  It
      puts all the logic in a new function, check_mem_permission().
      
      The old code repeated the (!MAY_PTRACE(task) || !ptrace_may_attach(task))
      magical expression multiple times.  The new function does all that work in one
      place, with clear comments.
      
      The old code called security_ptrace() twice on successful checks, once in
      MAY_PTRACE() and once in __ptrace_may_attach().  Now it's only called once,
      and only if all other checks have succeeded.
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • A
      0d5c9f5f
    • M
      procfs task exe symlink · 925d1c40
      Matt Helsley committed
      The kernel implements readlink of /proc/pid/exe by getting the file from
      the first executable VMA.  Then the path to the file is reconstructed and
      reported as the result.
      
      Because of the VMA walk the code is slightly different on nommu systems.
      This patch avoids separate /proc/pid/exe code on nommu systems.  Instead of
      walking the VMAs to find the first executable file-backed VMA we store a
      reference to the exec'd file in the mm_struct.
      
      That reference would prevent the filesystem holding the executable file
      from being unmounted even after unmapping the VMAs.  So we track the number
      of VM_EXECUTABLE VMAs and drop the new reference when the last one is
      unmapped.  This avoids pinning the mounted filesystem.
      
      [akpm@linux-foundation.org: improve comments]
      [yamamoto@valinux.co.jp: fix dup_mmap]
      Signed-off-by: Matt Helsley <matthltc@us.ibm.com>
      Cc: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: David Howells <dhowells@redhat.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: YAMAMOTO Takashi <yamamoto@valinux.co.jp>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
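      The reference-dropping scheme described above can be sketched with a
      simple counter.  The struct and function names are hypothetical, and a
      plain string stands in for the file reference held by the mm.  The sketch
      shows the stored reference being released when the last executable
      mapping goes away, so the filesystem is no longer pinned.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the per-process memory descriptor. */
struct toy_mm {
    const char *exe_file;        /* reference to the exec'd file, or NULL */
    unsigned long num_exe_vmas;  /* executable mappings of that file */
};

/* Record the exec'd file at exec time; no mappings are counted yet. */
void toy_set_exe(struct toy_mm *mm, const char *path)
{
    mm->exe_file = path;
    mm->num_exe_vmas = 0;
}

/* Called when an executable mapping of the file is created. */
void toy_added_exe_vma(struct toy_mm *mm)
{
    mm->num_exe_vmas++;
}

/* Called when such a mapping is removed; drops the reference at zero. */
void toy_removed_exe_vma(struct toy_mm *mm)
{
    if (mm->num_exe_vmas == 0)
        return;                  /* nothing to account for */
    if (--mm->num_exe_vmas == 0)
        mm->exe_file = NULL;     /* last mapping gone: stop pinning */
}

/* What a readlink of the exe link would report, without any VMA walk. */
const char *toy_readlink_exe(const struct toy_mm *mm)
{
    return mm->exe_file;
}
```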
    • A
      proc: print more information when removing non-empty directories · e93b4ea2
      Alexey Dobriyan committed
      This usually saves one recompile otherwise needed to insert a printk like the one below. :)
      
      Sample nastygram:
      
      remove_proc_entry: removing non-empty directory '/proc/foo', leaking at least 'bar'
      ------------[ cut here ]------------
      WARNING: at fs/proc/generic.c:776 remove_proc_entry+0x18a/0x200()
      Modules linked in: foo(-) container fan battery dock sbs ac sbshc backlight ipv6 loop af_packet amd_rng sr_mod i2c_amd8111 i2c_amd756 cdrom i2c_core button thermal processor
      Pid: 3034, comm: rmmod Tainted: G   M     2.6.25-rc1 #5
      
      Call Trace:
       [<ffffffff80231974>] warn_on_slowpath+0x64/0x90
       [<ffffffff80232a6e>] printk+0x4e/0x60
       [<ffffffff802d6c8a>] remove_proc_entry+0x18a/0x200
       [<ffffffff8045cd88>] mutex_lock_nested+0x1c8/0x2d0
       [<ffffffff8025f0f0>] __try_stop_module+0x0/0x40
       [<ffffffff8025effd>] sys_delete_module+0x14d/0x200
       [<ffffffff8045df3d>] lockdep_sys_exit_thunk+0x35/0x67
       [<ffffffff8031c307>] __up_read+0x27/0xa0
       [<ffffffff8045decc>] trace_hardirqs_on_thunk+0x35/0x3a
       [<ffffffff8020b6ab>] system_call_after_swapgs+0x7b/0x80
      
      ---[ end trace 10ef850597e89c54 ]---
      Signed-off-by: Alexey Dobriyan <adobriyan@sw.ru>
      Cc: Arjan van de Ven <arjan@linux.intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  14. 28 Apr, 2008 2 commits