1. 24 3月, 2011 1 次提交
  2. 15 10月, 2010 1 次提交
    • L
      Don't dump task struct in a.out core-dumps · 0eead9ab
      Linus Torvalds 提交于
      akiphie points out that a.out core-dumps have that odd task struct
      dumping that was never used and was never really a good idea (it goes
      back into the mists of history, probably the original core-dumping
      code).  Just remove it.
      
      Also do the access_ok() check on dump_write().  It probably doesn't
      matter (since normal filesystems all seem to do it anyway), but he
      points out that it's normally done by the VFS layer, so ...
      
      [ I suspect that we should possibly do "vfs_write()" instead of
        calling ->write directly.  That also does the whole fsnotify and write
        statistics thing, which may or may not be a good idea. ]
      
      And just to be anal, do this all for the x86-64 32-bit a.out emulation
      code too, even though it's not enabled (and won't currently even
      compile)
      Reported-by: Nakiphie <akiphie@lavabit.com>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0eead9ab
  3. 30 3月, 2010 1 次提交
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo 提交于
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Guess-its-ok-by: NChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  4. 11 2月, 2010 1 次提交
    • J
      x86, ia32_aout: do not kill argument mapping · 318f6b22
      Jiri Slaby 提交于
      Do not set current->mm->mmap to NULL in 32-bit emulation on 64-bit
      load_aout_binary after flush_old_exec as it would destroy already
      set brpm mapping with arguments.
      
      Introduced by b6a2fea3
      mm: variable length argument support
      where the argument mapping in bprm was added.
      
      [ hpa: this is a regression from 2.6.22... time to kill a.out? ]
      Signed-off-by: NJiri Slaby <jslaby@suse.cz>
      LKML-Reference: <1265831716-7668-1-git-send-email-jslaby@suse.cz>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ollie Wild <aaw@google.com>
      Cc: x86@kernel.org
      Cc: <stable@kernel.org>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      318f6b22
  5. 30 1月, 2010 2 次提交
    • H
      x86: get rid of the insane TIF_ABI_PENDING bit · 05d43ed8
      H. Peter Anvin 提交于
      Now that the previous commit made it possible to do the personality
      setting at the point of no return, we do just that for ELF binaries.
      And suddenly all the reasons for that insane TIF_ABI_PENDING bit go
      away, and we can just make SET_PERSONALITY() just do the obvious thing
      for a 32-bit compat process.
      
      Everything becomes much more straightforward this way.
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: stable@kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      05d43ed8
    • L
      Split 'flush_old_exec' into two functions · 221af7f8
      Linus Torvalds 提交于
      'flush_old_exec()' is the point of no return when doing an execve(), and
      it is pretty badly misnamed.  It doesn't just flush the old executable
      environment, it also starts up the new one.
      
      Which is very inconvenient for things like setting up the new
      personality, because we want the new personality to affect the starting
      of the new environment, but at the same time we do _not_ want the new
      personality to take effect if flushing the old one fails.
      
      As a result, the x86-64 '32-bit' personality is actually done using this
      insane "I'm going to change the ABI, but I haven't done it yet" bit
      (TIF_ABI_PENDING), with SET_PERSONALITY() not actually setting the
      personality, but just the "pending" bit, so that "flush_thread()" can do
      the actual personality magic.
      
      This patch in no way changes any of that insanity, but it does split the
      'flush_old_exec()' function up into a preparatory part that can fail
      (still called flush_old_exec()), and a new part that will actually set
      up the new exec environment (setup_new_exec()).  All callers are changed
      to trivially comply with the new world order.
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Cc: stable@kernel.org
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      221af7f8
  6. 28 1月, 2010 1 次提交
    • J
      x86: Use helpers for rlimits · 2854e72b
      Jiri Slaby 提交于
      Make sure compiler won't do weird things with limits.  Fetching them
      twice may return 2 different values after writable limits are
      implemented.
      
      We can either use rlimit helpers added in
      3e10e716 or ACCESS_ONCE if not
      applicable; this patch uses the helpers.
      Signed-off-by: NJiri Slaby <jslaby@suse.cz>
      LKML-Reference: <1264609942-24621-1-git-send-email-jslaby@suse.cz>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      2854e72b
  7. 14 11月, 2008 1 次提交
    • D
      CRED: Make execve() take advantage of copy-on-write credentials · a6f76f23
      David Howells 提交于
      Make execve() take advantage of copy-on-write credentials, allowing it to set
      up the credentials in advance, and then commit the whole lot after the point
      of no return.
      
      This patch and the preceding patches have been tested with the LTP SELinux
      testsuite.
      
      This patch makes several logical sets of alteration:
      
       (1) execve().
      
           The credential bits from struct linux_binprm are, for the most part,
           replaced with a single credentials pointer (bprm->cred).  This means that
           all the creds can be calculated in advance and then applied at the point
           of no return with no possibility of failure.
      
           I would like to replace bprm->cap_effective with:
      
      	cap_isclear(bprm->cap_effective)
      
           but this seems impossible due to special behaviour for processes of pid 1
           (they always retain their parent's capability masks where normally they'd
           be changed - see cap_bprm_set_creds()).
      
           The following sequence of events now happens:
      
           (a) At the start of do_execve, the current task's cred_exec_mutex is
           	 locked to prevent PTRACE_ATTACH from obsoleting the calculation of
           	 creds that we make.
      
           (a) prepare_exec_creds() is then called to make a copy of the current
           	 task's credentials and prepare it.  This copy is then assigned to
           	 bprm->cred.
      
        	 This renders security_bprm_alloc() and security_bprm_free()
           	 unnecessary, and so they've been removed.
      
           (b) The determination of unsafe execution is now performed immediately
           	 after (a) rather than later on in the code.  The result is stored in
           	 bprm->unsafe for future reference.
      
           (c) prepare_binprm() is called, possibly multiple times.
      
           	 (i) This applies the result of set[ug]id binaries to the new creds
           	     attached to bprm->cred.  Personality bit clearance is recorded,
           	     but now deferred on the basis that the exec procedure may yet
           	     fail.
      
               (ii) This then calls the new security_bprm_set_creds().  This should
      	     calculate the new LSM and capability credentials into *bprm->cred.
      
      	     This folds together security_bprm_set() and parts of
      	     security_bprm_apply_creds() (these two have been removed).
      	     Anything that might fail must be done at this point.
      
               (iii) bprm->cred_prepared is set to 1.
      
      	     bprm->cred_prepared is 0 on the first pass of the security
      	     calculations, and 1 on all subsequent passes.  This allows SELinux
      	     in (ii) to base its calculations only on the initial script and
      	     not on the interpreter.
      
           (d) flush_old_exec() is called to commit the task to execution.  This
           	 performs the following steps with regard to credentials:
      
      	 (i) Clear pdeath_signal and set dumpable on certain circumstances that
      	     may not be covered by commit_creds().
      
               (ii) Clear any bits in current->personality that were deferred from
                   (c.i).
      
           (e) install_exec_creds() [compute_creds() as was] is called to install the
           	 new credentials.  This performs the following steps with regard to
           	 credentials:
      
               (i) Calls security_bprm_committing_creds() to apply any security
                   requirements, such as flushing unauthorised files in SELinux, that
                   must be done before the credentials are changed.
      
      	     This is made up of bits of security_bprm_apply_creds() and
      	     security_bprm_post_apply_creds(), both of which have been removed.
      	     This function is not allowed to fail; anything that might fail
      	     must have been done in (c.ii).
      
               (ii) Calls commit_creds() to apply the new credentials in a single
                   assignment (more or less).  Possibly pdeath_signal and dumpable
                   should be part of struct creds.
      
      	 (iii) Unlocks the task's cred_replace_mutex, thus allowing
      	     PTRACE_ATTACH to take place.
      
               (iv) Clears The bprm->cred pointer as the credentials it was holding
                   are now immutable.
      
               (v) Calls security_bprm_committed_creds() to apply any security
                   alterations that must be done after the creds have been changed.
                   SELinux uses this to flush signals and signal handlers.
      
           (f) If an error occurs before (d.i), bprm_free() will call abort_creds()
           	 to destroy the proposed new credentials and will then unlock
           	 cred_replace_mutex.  No changes to the credentials will have been
           	 made.
      
       (2) LSM interface.
      
           A number of functions have been changed, added or removed:
      
           (*) security_bprm_alloc(), ->bprm_alloc_security()
           (*) security_bprm_free(), ->bprm_free_security()
      
           	 Removed in favour of preparing new credentials and modifying those.
      
           (*) security_bprm_apply_creds(), ->bprm_apply_creds()
           (*) security_bprm_post_apply_creds(), ->bprm_post_apply_creds()
      
           	 Removed; split between security_bprm_set_creds(),
           	 security_bprm_committing_creds() and security_bprm_committed_creds().
      
           (*) security_bprm_set(), ->bprm_set_security()
      
           	 Removed; folded into security_bprm_set_creds().
      
           (*) security_bprm_set_creds(), ->bprm_set_creds()
      
           	 New.  The new credentials in bprm->creds should be checked and set up
           	 as appropriate.  bprm->cred_prepared is 0 on the first call, 1 on the
           	 second and subsequent calls.
      
           (*) security_bprm_committing_creds(), ->bprm_committing_creds()
           (*) security_bprm_committed_creds(), ->bprm_committed_creds()
      
           	 New.  Apply the security effects of the new credentials.  This
           	 includes closing unauthorised files in SELinux.  This function may not
           	 fail.  When the former is called, the creds haven't yet been applied
           	 to the process; when the latter is called, they have.
      
       	 The former may access bprm->cred, the latter may not.
      
       (3) SELinux.
      
           SELinux has a number of changes, in addition to those to support the LSM
           interface changes mentioned above:
      
           (a) The bprm_security_struct struct has been removed in favour of using
           	 the credentials-under-construction approach.
      
           (c) flush_unauthorized_files() now takes a cred pointer and passes it on
           	 to inode_has_perm(), file_has_perm() and dentry_open().
      Signed-off-by: NDavid Howells <dhowells@redhat.com>
      Acked-by: NJames Morris <jmorris@namei.org>
      Acked-by: NSerge Hallyn <serue@us.ibm.com>
      Signed-off-by: NJames Morris <jmorris@namei.org>
      a6f76f23
  8. 20 8月, 2008 1 次提交
  9. 27 7月, 2008 1 次提交
  10. 08 2月, 2008 1 次提交
    • H
      Sanitize the type of struct user.u_ar0 · 6e16d89b
      H. Peter Anvin 提交于
      struct user.u_ar0 is defined to contain a pointer offset on all
      architectures in which it is defined (all architectures which define an
      a.out format except SPARC.) However, it has a pointer type in the headers,
      which is pointless -- <asm/user.h> is not exported to userspace, and it
      just makes the code messy.
      
      Redefine the field as "unsigned long" (which is the same size as a pointer
      on all Linux architectures) and change the setting code to user offsetof()
      instead of hand-coded arithmetic.
      
      Cc: Linux Arch Mailing List <linux-arch@vger.kernel.org>
      Cc: Bryan Wu <bryan.wu@analog.com>
      Cc: Roman Zippel <zippel@linux-m68k.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Russell King <rmk@arm.linux.org.uk>
      Cc: Lennert Buytenhek <kernel@wantstofly.org>
      Cc: Håvard Skinnemoen <hskinnemoen@atmel.com>
      Cc: Mikael Starvik <starvik@axis.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Hirokazu Takata <takata@linux-m32r.org>
      Cc: Ralf Baechle <ralf@linux-mips.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
      Cc: Paul Mundt <lethal@linux-sh.org>
      Signed-off-by: NH. Peter Anvin <hpa@zytor.com>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      6e16d89b
  11. 30 1月, 2008 3 次提交
  12. 18 10月, 2007 1 次提交
  13. 17 10月, 2007 1 次提交
    • N
      core_pattern: ignore RLIMIT_CORE if core_pattern is a pipe · 7dc0b22e
      Neil Horman 提交于
      For some time /proc/sys/kernel/core_pattern has been able to set its output
      destination as a pipe, allowing a user space helper to receive and
      intellegently process a core.  This infrastructure however has some
      shortcommings which can be enhanced.  Specifically:
      
      1) The coredump code in the kernel should ignore RLIMIT_CORE limitation
         when core_pattern is a pipe, since file system resources are not being
         consumed in this case, unless the user application wishes to save the core,
         at which point the app is restricted by usual file system limits and
         restrictions.
      
      2) The core_pattern code should be able to parse and pass options to the
         user space helper as an argv array.  The real core limit of the uid of the
         crashing proces should also be passable to the user space helper (since it
         is overridden to zero when called).
      
      3) Some miscellaneous bugs need to be cleaned up (specifically the
         recognition of a recursive core dump, should the user mode helper itself
         crash.  Also, the core dump code in the kernel should not wait for the user
         mode helper to exit, since the same context is responsible for writing to
         the pipe, and a read of the pipe by the user mode helper will result in a
         deadlock.
      
      This patch:
      
      Remove the check of RLIMIT_CORE if core_pattern is a pipe.  In the event that
      core_pattern is a pipe, the entire core will be fed to the user mode helper.
      Signed-off-by: NNeil Horman <nhorman@tuxdriver.com>
      Cc: <martin.pitt@ubuntu.com>
      Cc: <wwoods@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@goop.org>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      7dc0b22e
  14. 11 10月, 2007 1 次提交
  15. 20 7月, 2007 1 次提交
  16. 11 1月, 2007 1 次提交
    • A
      [PATCH] x86-64: Fix warnings in ia32_aout.c · 74019699
      Andi Kleen 提交于
      Fix
      
      linux/arch/x86_64/ia32/ia32_aout.c: In function ‘create_aout_tables’:
      linux/arch/x86_64/ia32/ia32_aout.c:244: warning: cast from pointer to integer of different size
      linux/arch/x86_64/ia32/ia32_aout.c:253: warning: cast from pointer to integer of different size
      
      with gcc 4.3
      Signed-off-by: NAndi Kleen <ak@suse.de>
      74019699
  17. 09 12月, 2006 1 次提交
  18. 26 9月, 2006 1 次提交
  19. 15 11月, 2005 1 次提交
  20. 30 10月, 2005 1 次提交
  21. 28 7月, 2005 1 次提交
  22. 22 6月, 2005 1 次提交
    • W
      [PATCH] Avoiding mmap fragmentation · 1363c3cd
      Wolfgang Wander 提交于
      Ingo recently introduced a great speedup for allocating new mmaps using the
      free_area_cache pointer which boosts the specweb SSL benchmark by 4-5% and
      causes huge performance increases in thread creation.
      
      The downside of this patch is that it does lead to fragmentation in the
      mmap-ed areas (visible via /proc/self/maps), such that some applications
      that work fine under 2.4 kernels quickly run out of memory on any 2.6
      kernel.
      
      The problem is twofold:
      
        1) the free_area_cache is used to continue a search for memory where
           the last search ended.  Before the change new areas were always
           searched from the base address on.
      
           So now new small areas are cluttering holes of all sizes
           throughout the whole mmap-able region whereas before small holes
           tended to close holes near the base leaving holes far from the base
           large and available for larger requests.
      
        2) the free_area_cache also is set to the location of the last
           munmap-ed area so in scenarios where we allocate e.g.  five regions of
           1K each, then free regions 4 2 3 in this order the next request for 1K
           will be placed in the position of the old region 3, whereas before we
           appended it to the still active region 1, placing it at the location
           of the old region 2.  Before we had 1 free region of 2K, now we only
           get two free regions of 1K -> fragmentation.
      
      The patch addresses thes issues by introducing yet another cache descriptor
      cached_hole_size that contains the largest known hole size below the
      current free_area_cache.  If a new request comes in the size is compared
      against the cached_hole_size and if the request can be filled with a hole
      below free_area_cache the search is started from the base instead.
      
      The results look promising: Whereas 2.6.12-rc4 fragments quickly and my
      (earlier posted) leakme.c test program terminates after 50000+ iterations
      with 96 distinct and fragmented maps in /proc/self/maps it performs nicely
      (as expected) with thread creation, Ingo's test_str02 with 20000 threads
      requires 0.7s system time.
      
      Taking out Ingo's patch (un-patch available per request) by basically
      deleting all mentions of free_area_cache from the kernel and starting the
      search for new memory always at the respective bases we observe: leakme
      terminates successfully with 11 distinctive hardly fragmented areas in
      /proc/self/maps but thread creating is gringdingly slow: 30+s(!) system
      time for Ingo's test_str02 with 20000 threads.
      
      Now - drumroll ;-) the appended patch works fine with leakme: it ends with
      only 7 distinct areas in /proc/self/maps and also thread creation seems
      sufficiently fast with 0.71s for 20000 threads.
      Signed-off-by: NWolfgang Wander <wwc@rentec.com>
      Credit-to: "Richard Purdie" <rpurdie@rpsys.net>
      Signed-off-by: NKen Chen <kenneth.w.chen@intel.com>
      Acked-by: Ingo Molnar <mingo@elte.hu> (partly)
      Signed-off-by: NAndrew Morton <akpm@osdl.org>
      Signed-off-by: NLinus Torvalds <torvalds@osdl.org>
      1363c3cd
  23. 17 4月, 2005 1 次提交
    • L
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds 提交于
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4