1. 21 Mar, 2006 12 commits
    • [PATCH] audit string fields interface + consumer · 93315ed6
      Committed by Amy Griffis
      Updated patch to dynamically allocate audit rule fields in kernel's
      internal representation.  Added unlikely() calls for testing memory
      allocation result.
      
      Amy Griffis wrote:     [Wed Jan 11 2006, 02:02:31PM EST]
      > Modify audit's kernel-userspace interface to allow the specification
      > of string fields in audit rules.
      >
      > Signed-off-by: Amy Griffis <amy.griffis@hp.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
      (cherry picked from 5ffc4a863f92351b720fe3e9c5cd647accff9e03 commit)
    • [PATCH] Fix audit record filtering with !CONFIG_AUDITSYSCALL · fe7752ba
      Committed by David Woodhouse
      This fixes the per-user and per-message-type filtering when syscall
      auditing isn't enabled.
      
      [AV: folded followup fix from the same author]
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • [PATCH] Miscellaneous bug and warning fixes · 7306a0b9
      Committed by Dustin Kirkland
      This patch fixes a couple of bugs revealed in new features recently
      added to -mm1:
      * fixes warnings due to inconsistent use of const struct inode *inode
      * fixes a bug that prevents a kernel from booting with audit on and SELinux off
        due to a missing function in security/dummy.c
      * fixes a bug that throws spurious audit_panic() messages due to a missing
        return just before an error_path label
      * some reasonable house cleaning in audit_ipc_context(),
        audit_inode_context(), and audit_log_task_context()
      Signed-off-by: Dustin Kirkland <dustin.kirkland@us.ibm.com>
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
    • [PATCH] Capture selinux subject/object context information. · 8c8570fb
      Committed by Dustin Kirkland
      This patch extends existing audit records with subject/object context
      information. Audit records associated with filesystem inodes, ipc, and
      tasks now contain SELinux label information in the field "subj" if the
      item is performing the action, or in "obj" if the item is the receiver
      of an action.
      
      These labels are collected via hooks in SELinux and appended to the
      appropriate record in the audit code.
      
      This additional information is required for Common Criteria Labeled
      Security Protection Profile (LSPP).
      
      [AV: fixed kmalloc flags use]
      [folded leak fixes]
      [folded cleanup from akpm (kfree(NULL))]
      [folded audit_inode_context() leak fix]
      [folded akpm's fix for audit_ipc_perm() definition in case of !CONFIG_AUDIT]
      Signed-off-by: Dustin Kirkland <dustin.kirkland@us.ibm.com>
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • [PATCH] Exclude messages by message type · c8edc80c
      Committed by Dustin Kirkland
          - Add a new, 5th filter called "exclude".
          - And add a new field AUDIT_MSGTYPE.
          - Define a new function audit_filter_exclude() that takes a message type
            as input and examines all rules in the filter.  It returns '1' if the
            message is to be excluded, and '0' otherwise.
          - Call the audit_filter_exclude() function near the top of
            audit_log_start() just after asserting audit_initialized.  If the
            message type is not to be audited, return NULL very early, before
            doing a lot of work.
      [combined with followup fix for bug in original patch, Nov 4, same author]
      [combined with later renaming AUDIT_FILTER_EXCLUDE->AUDIT_FILTER_TYPE
      and audit_filter_exclude() -> audit_filter_type()]
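
The early-exit check described in this message can be sketched in userspace C as follows. This is only an illustration under stated assumptions: `struct type_rule`, `struct log_buf`, `filter_type()` and `log_start()` are hypothetical names invented here, not the kernel's data structures or functions.

```c
#include <stddef.h>

/* Hypothetical rule: exclude one message type. Not the kernel's struct. */
struct type_rule {
    int msgtype;
};

/* Hypothetical stand-in for a log buffer. */
struct log_buf {
    int type;
};

/* Returns 1 if a message of this type is to be excluded, 0 otherwise. */
static int filter_type(const struct type_rule *rules, size_t n, int type)
{
    for (size_t i = 0; i < n; i++) {
        if (rules[i].msgtype == type)
            return 1;
    }
    return 0;
}

/*
 * Consults the filter first and returns NULL before doing any other
 * work, mirroring the "return NULL very early" step in the message.
 */
static struct log_buf *log_start(const struct type_rule *rules, size_t n,
                                 struct log_buf *storage, int type)
{
    if (filter_type(rules, n, type))
        return NULL;
    storage->type = type;
    return storage;
}
```

A caller that gets NULL back simply skips logging, so excluded message types cost one pass over the rule list and nothing more.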
      Signed-off-by: Dustin Kirkland <dustin.kirkland@us.ibm.com>
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • [PATCH] Collect more inode information during syscall processing. · 73241ccc
      Committed by Amy Griffis
      This patch augments the collection of inode info during syscall
      processing. It represents part of the functionality that was provided
      by the auditfs patch included in RHEL4.
      
      Specifically, it:
      
      - Collects information for target inodes created or removed during
        syscalls.  Previous code only collects information for the target
        inode's parent.
      
      - Adds the audit_inode() hook to syscalls that operate on a file
        descriptor (e.g. fchown), enabling audit to do inode filtering for
        these calls.
      
      - Modifies filtering code to check audit context for either an inode #
        or a parent inode # matching a given rule.
      
      - Modifies logging to provide inode # for both parent and child.
      
      - Protect debug info from NULL audit_names.name.
      
      [AV: folded a later typo fix from the same author]
      Signed-off-by: Amy Griffis <amy.griffis@hp.com>
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • [PATCH] Pass dentry, not just name, in fsnotify creation hooks. · f38aa942
      Committed by Amy Griffis
      The audit hooks (to be added shortly) will want to see dentry->d_inode
      too, not just the name.
      Signed-off-by: Amy Griffis <amy.griffis@hp.com>
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
    • [PATCH] Define new range of userspace messages. · 90d526c0
      Committed by Steve Grubb
      The attached patch updates various items for the new user space
      messages. Please apply.
      Signed-off-by: Steve Grubb <sgrubb@redhat.com>
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
    • [PATCH] Filter rule comparators · b63862f4
      Committed by Dustin Kirkland
      Currently, audit only supports the "=" and "!=" operators in the -F
      filter rules.
      
      This patch reworks the support for "=" and "!=", and adds support
      for ">", ">=", "<", and "<=".
      
      This turned out to be a pretty clean and simple process.  I ended up
      using the high order bits of the "field", as suggested by Steve and Amy.
      This allowed for no changes whatsoever to the netlink communications.
      See the documentation within the patch in the include/linux/audit.h
      area, where there is a table that explains the reasoning of the bitmask
      assignments clearly.
      
      The patch adds a new function, audit_comparator(left, op, right).
      This function will perform the specified comparison (op, which defaults
      to "==" for backward compatibility) between two values (left and right).
      If the negate bit is on, it will negate whatever that result was.  This
      value is returned.
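
A minimal sketch of the comparator described above, written as standalone C. The operator bit values below are placeholders chosen for this example only; the message says the real assignments are tabulated in include/linux/audit.h, and they are not reproduced here. The name `compare_sketch()` is likewise hypothetical.

```c
#include <stdint.h>

/* Placeholder operator bits, invented for this sketch. */
#define OP_LT     0x1u
#define OP_GT     0x2u
#define OP_EQ     0x4u
#define OP_NE     (OP_LT | OP_GT)
#define OP_LE     (OP_LT | OP_EQ)
#define OP_GE     (OP_GT | OP_EQ)
#define OP_NEGATE 0x8u

/*
 * Compares left and right with the operator encoded in op. An
 * unrecognized or empty operator falls back to "==", and the negate
 * bit inverts whatever the comparison produced.
 */
static int compare_sketch(uint32_t left, uint32_t op, uint32_t right)
{
    int result;

    switch (op & ~OP_NEGATE) {
    case OP_LT: result = left <  right; break;
    case OP_GT: result = left >  right; break;
    case OP_NE: result = left != right; break;
    case OP_LE: result = left <= right; break;
    case OP_GE: result = left >= right; break;
    case OP_EQ:
    default:    result = left == right; break;
    }
    return (op & OP_NEGATE) ? !result : result;
}
```

Composing "<=" and ">=" from the "<", ">" and "=" bits is one way to keep the encoding compact; whether the kernel composes them the same way is not stated in the message.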
      Signed-off-by: Dustin Kirkland <dustin.kirkland@us.ibm.com>
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
    • [PATCH] AUDIT: kerneldoc for kernel/audit*.c · b0dd25a8
      Committed by Randy Dunlap
      - add kerneldoc for non-static functions;
      - don't init static data to 0;
      - limit lines to < 80 columns;
      - fix long-format style;
      - delete whitespace at end of some lines;
      
      (chrisw: resend and update to current audit-2.6 tree)
      Signed-off-by: Randy Dunlap <rdunlap@xenotime.net>
      Signed-off-by: Chris Wright <chrisw@osdl.org>
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
    • [PATCH] make vm86 call audit_syscall_exit · 7e7f8a03
      Committed by Jason Baron
      hi,
      
      The motivation behind the patch below was to address messages in
      /var/log/messages such as:
      
      Jan 31 10:54:15 mets kernel: audit(:0): major=252 name_count=0: freeing
      multiple contexts (1)
      Jan 31 10:54:15 mets kernel: audit(:0): major=113 name_count=0: freeing
      multiple contexts (2)
      
      I can reproduce by running 'get-edid' from:
      http://john.fremlin.de/programs/linux/read-edid/.
      
      These messages come about in the log b/c the vm86 calls do not exit via
      the normal system call exit paths and thus do not call
      'audit_syscall_exit'. The next system call will then free the context for
      itself and for the vm86 context, thus generating the above messages. This
      patch addresses the issue by simply adding a call to 'audit_syscall_exit'
      from the vm86 code.
      
      Besides fixing the above error messages the patch also now allows vm86
      system calls to become auditable. This is useful since strace does not
      appear to properly record the return values from sys_vm86.
      
      I think this patch is also a step in the right direction in terms of
      cleaning up some core auditing code. If we can correct any other paths
      that do not properly call the audit exit and entry points, then we can
      also eliminate the notion of context chaining.
      
      I've tested this patch by verifying that the log messages no longer
      appear, and that the audit records for sys_vm86 appear to be correct.
      Also, 'read_edid' produces identical output.
      
      thanks,
      
      -Jason
      Signed-off-by: Jason Baron <jbaron@redhat.com>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
  2. 19 Mar, 2006 1 commit
  3. 17 Mar, 2006 3 commits
  4. 14 Mar, 2006 1 commit
    • [PATCH] Fix sigaltstack corruption among cloned threads · f9a3879a
      Committed by GOTO Masanori
      This patch fixes alternate signal stack corruption among cloned threads
      with CLONE_SIGHAND (and CLONE_VM) for linux-2.6.16-rc6.
      
      The alternate signal stack setting is currently inherited across a call
      to clone(...  CLONE_SIGHAND | CLONE_VM).  If a parent thread has set
      sigaltstack and several cloned child threads (plus the parent thread)
      run a signal handler at the same time, they conflict because they all
      use the same alternate signal stack region, and they eventually get
      SIGSEGV.  It's an undesirable race condition.  Note that child threads
      created from NPTL pthread_create() also hit this conflict when the
      parent thread uses sigaltstack, without my patch.
      
      To fix this problem, this patch clears the child threads' sigaltstack
      information like exec().  This behavior follows the SUSv3 specification.
      In SUSv3, pthread_create() says "The alternate stack shall not be inherited
      (when new threads are initialized)".  It means that sigaltstack should be
      cleared when sigaltstack memory space is shared by cloned threads with
      CLONE_SIGHAND.
      
      Note that I chose "if (clone_flags & CLONE_SIGHAND)" line because:
        - If the clone_flags test did not exist, fork() would not inherit sigaltstack.
        - CLONE_VM is another choice, but then vfork() would not inherit sigaltstack.
        - CLONE_SIGHAND implies CLONE_VM, and it looks suitable.
        - CLONE_THREAD is another candidate, and includes CLONE_SIGHAND + CLONE_VM,
          but this flag has a bit different semantics.
      I decided to use CLONE_SIGHAND.
      
      [ Changed to test for CLONE_VM && !CLONE_VFORK after discussion --Linus ]
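
The final condition noted by Linus (CLONE_VM set and CLONE_VFORK clear) can be sketched as below. This is a userspace illustration, not the patch itself: `struct task_sketch`, `inherit_altstack()` and the `SK_` flag values are hypothetical stand-ins, and real code would take the flags from <sched.h>.

```c
#include <stddef.h>

/* Placeholder flag values for this sketch only. */
#define SK_CLONE_VM    0x100u
#define SK_CLONE_VFORK 0x4000u

/* Hypothetical per-task alternate signal stack state. */
struct task_sketch {
    unsigned long sas_ss_sp;   /* base address of the alternate stack */
    size_t        sas_ss_size; /* its size in bytes */
};

/*
 * Copies the parent's setting, then clears it when the child shares
 * the address space (CLONE_VM) without being a vfork() child, since
 * only then can parent and child run handlers on the same stack
 * concurrently.
 */
static void inherit_altstack(struct task_sketch *child,
                             const struct task_sketch *parent,
                             unsigned int clone_flags)
{
    *child = *parent;
    if ((clone_flags & (SK_CLONE_VM | SK_CLONE_VFORK)) == SK_CLONE_VM) {
        child->sas_ss_sp = 0;
        child->sas_ss_size = 0;
    }
}
```

With this test, fork() (no CLONE_VM) and vfork() (CLONE_VM | CLONE_VFORK) keep the parent's setting, while thread-style clones start without an alternate stack.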
      Signed-off-by: GOTO Masanori <gotom@sanori.org>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Acked-by: Linus Torvalds <torvalds@osdl.org>
      Cc: Ulrich Drepper <drepper@redhat.com>
      Cc: Jakub Jelinek <jakub@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  5. 12 Mar, 2006 1 commit
  6. 09 Mar, 2006 3 commits
    • [PATCH] fix file counting · 529bf6be
      Committed by Dipankar Sarma
      I have benchmarked this on an x86_64 NUMA system and see no significant
      performance difference on kernbench.  Tested on both x86_64 and powerpc.
      
      The way we do file struct accounting is not very suitable for batched
      freeing.  For scalability reasons, file accounting was
      constructor/destructor based.  This meant that nr_files was decremented
      only when the object was removed from the slab cache.  This is susceptible
      to slab fragmentation.  With RCU based file structure, consequent batched
      freeing and a test program like Serge's, we just speed this up and end up
      with a very fragmented slab -
      
      llm22:~ # cat /proc/sys/fs/file-nr
      587730  0       758844
      
      At the same time, I see only 2000+ objects in the filp cache.  The
      following patch fixes this problem.
      
      This patch changes the file counting by removing the filp_count_lock.
      Instead we use a separate percpu counter, nr_files, for now and all
      accesses to it are through get_nr_files() api.  In the sysctl handler for
      nr_files, we populate files_stat.nr_files before returning to user.
      
      Counting files as and when they are created and destroyed (as opposed to
      inside slab) allows us to correctly count open files with RCU.
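
The counting scheme described above, one counter per CPU that is summed on read, can be sketched in plain C. This is a single-threaded illustration under stated assumptions: `NR_CPUS_SKETCH`, `file_created()`, `file_destroyed()` and `get_nr_files_sketch()` are hypothetical names, and a real per-CPU counter would also need the kernel's per-CPU and preemption handling, which is omitted here.

```c
/* Hypothetical CPU count for this sketch. */
#define NR_CPUS_SKETCH 4

/* One slot per CPU; a slot may go negative when a file is created on
 * one CPU and destroyed on another. Only the sum is meaningful. */
static long nr_files_percpu[NR_CPUS_SKETCH];

/* Called when a file structure is created on the given CPU. */
static void file_created(int cpu)
{
    nr_files_percpu[cpu]++;
}

/* Called when a file structure is destroyed on the given CPU. */
static void file_destroyed(int cpu)
{
    nr_files_percpu[cpu]--;
}

/* Sums all per-CPU slots to get the current number of open files. */
static long get_nr_files_sketch(void)
{
    long sum = 0;

    for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        sum += nr_files_percpu[cpu];
    return sum;
}
```

Because the count changes at create and destroy time rather than when slab objects are reclaimed, it does not depend on when batched RCU freeing happens.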
      Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
      Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] rcu batch tuning · 21a1ea9e
      Committed by Dipankar Sarma
      This patch adds new tunables for RCU queue and finished batches.  There are
      two types of controls - number of completed RCU updates invoked in a batch
      (blimit) and monitoring for high rate of incoming RCUs on a cpu (qhimark,
      qlowmark).
      
      By default, the per-cpu batch limit is set to a small value.  If the input
      RCU rate exceeds the high watermark, we do two things - force quiescent
      state on all cpus and set the batch limit of the CPU to INTMAX.  Setting
      batch limit to INTMAX forces all finished RCUs to be processed in one shot.
      If we have more than INTMAX RCUs queued up, then we have bigger problems
      anyway.  Once the incoming queued RCUs fall below the low watermark, the
      batch limit is set to the default.
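
The watermark behaviour described above can be sketched as a small state machine. The numeric defaults, `struct rcu_sketch`, `rcu_enqueue()` and `rcu_batch_done()` are assumptions made for this example, and the sketch uses the standard C `INT_MAX` for the value the message writes as INTMAX.

```c
#include <limits.h>

/* Placeholder tunables for this sketch; not the patch's defaults. */
#define BLIMIT_DEFAULT 10
#define QHIMARK        10000
#define QLOWMARK       100

/* Hypothetical per-CPU RCU queue state. */
struct rcu_sketch {
    long qlen;   /* callbacks currently queued */
    int  blimit; /* max callbacks invoked per batch */
};

/*
 * Queues one callback. Returns 1 when the high watermark was exceeded,
 * in which case the batch limit is lifted and the caller should force
 * a quiescent state on all CPUs; returns 0 otherwise.
 */
static int rcu_enqueue(struct rcu_sketch *r)
{
    r->qlen++;
    if (r->qlen > QHIMARK) {
        r->blimit = INT_MAX;
        return 1;
    }
    return 0;
}

/*
 * Accounts for a finished batch and restores the default batch limit
 * once the queue has fallen below the low watermark.
 */
static void rcu_batch_done(struct rcu_sketch *r, long processed)
{
    r->qlen -= processed;
    if (r->blimit == INT_MAX && r->qlen < QLOWMARK)
        r->blimit = BLIMIT_DEFAULT;
}
```

Using two separate watermarks gives hysteresis, so the limit does not flip back and forth when the queue length hovers around a single threshold.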
      Signed-off-by: Dipankar Sarma <dipankar@in.ibm.com>
      Cc: "Paul E. McKenney" <paulmck@us.ibm.com>
      Cc: "David S. Miller" <davem@davemloft.net>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] idle threads should have a sane ->timestamp value · 81c29a85
      Committed by Ingo Molnar
      Idle threads should have a sane ->timestamp value, to avoid init kernel
      thread(s) from inheriting it and causing miscalculations in
      try_to_wake_up().
      
      Reported-by: Mike Galbraith <efault@gmx.de>.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Cc: Nick Piggin <nickpiggin@yahoo.com.au>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  7. 07 Mar, 2006 3 commits
  8. 03 Mar, 2006 2 commits
  9. 01 Mar, 2006 1 commit
  10. 21 Feb, 2006 4 commits
  11. 19 Feb, 2006 1 commit
  12. 18 Feb, 2006 4 commits
    • [PATCH] swsusp: fix breakage with swap on LVM · a8534adb
      Committed by Rafael J. Wysocki
      Restore compatibility with the older code and make it possible to
      suspend if the kernel command line doesn't contain the "resume=" argument.
      Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
      Cc: Pavel Machek <pavel@ucw.cz>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Introduce CONFIG_DEFAULT_MIGRATION_COST · 4bbf39c2
      Committed by Ingo Molnar
      Heiko Carstens <heiko.carstens@de.ibm.com> wrote:
      
        The boot sequence on s390 sometimes takes ages and we spend a very long
        time (up to one or two minutes) in calibrate_migration_costs.  The time
        spent there differs from boot to boot.  Also the calculated costs differ
        a lot.  I've seen differences by up to a factor of 15 (yes, factor not
        percent).  Also I doubt that making these measurements makes much sense on
        a completely virtualized architecture where you cannot tell how much cpu
        time you will get anyway.
      
      So introduce the CONFIG_DEFAULT_MIGRATION_COST method for an architecture
      to set the scheduler migration costs.  This turns off automatic detection
      of migration costs.  Makes sense on virtual platforms, where migration
      costs are hard to measure accurately.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Provide an interface for getting the current tick length · 726c14bf
      Committed by Paul Mackerras
      This provides an interface for arch code to find out how many
      nanoseconds are going to be added on to xtime by the next call to
      do_timer.  The value returned is a fixed-point number in 52.12 format
      in nanoseconds.  The reason for this format is that it gives the
      full precision that the timekeeping code is using internally.
      
      The motivation for this is to fix a problem that has arisen on 32-bit
      powerpc in that the value returned by do_gettimeofday drifts apart
      from xtime if NTP is being used.  PowerPC is now using a lockless
      do_gettimeofday based on reading the timebase register and performing
      some simple arithmetic.  (This method of getting the time is also
      exported to userspace via the VDSO.)  However, the factor and offset
      it uses were calculated based on the nominal tick length and weren't
      being adjusted when NTP varied the tick length.
      
      Note that 64-bit powerpc has had the lockless do_gettimeofday for a
      long time now.  It also had an extremely hairy routine that got called
      from the 32-bit compat routine for adjtimex, which adjusted the
      factor and offset according to what it thought the timekeeping code
      was going to do.  Not only was this only called if a 32-bit task did
      adjtimex (i.e. not if a 64-bit task did adjtimex), it was also
      duplicating computations from kernel/timer.c and it wasn't clear that
      it was (still) correct.
      
      The simple solution is to ask the timekeeping code how long the
      current jiffy will be on each timer interrupt, after calling
      do_timer.  If this jiffy will be a different length from the last one,
      we then need to compute new values for the factor and offset used in
      the lockless do_gettimeofday.  In this way we can keep xtime and
      do_gettimeofday in sync, even when NTP is varying the tick length.
      
      Note that when adjtimex varies the tick length, it almost always
      introduces the variation from the next tick on.  The only case I could
      see where adjtimex would vary the length of the current tick is when
      an old-style adjtime adjustment is being cancelled.  (It's not clear
      to me why the adjustment has to be cancelled immediately rather than
      from the next tick on.)  Thus I don't see any real need for a hook in
      adjtimex; the rare case of an old-style adjustment being cancelled can
      be fixed up at the next tick.
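
For readers unfamiliar with the 52.12 fixed-point format mentioned at the top of this message, the sketch below shows how such a value splits into whole nanoseconds and a 12-bit fraction. The helper names are hypothetical and this is not the interface the patch adds; it only illustrates the number format.

```c
#include <stdint.h>

/* 52.12 fixed point: upper 52 bits are whole nanoseconds, lower 12
 * bits are a fraction in units of 1/4096 ns. */
#define TICK_FRAC_BITS 12
#define TICK_FRAC_MASK ((1u << TICK_FRAC_BITS) - 1)

/* Packs whole nanoseconds plus a 12-bit fraction into 52.12 format. */
static uint64_t tick_to_fixed(uint64_t ns, uint32_t frac)
{
    return (ns << TICK_FRAC_BITS) | (frac & TICK_FRAC_MASK);
}

/* Extracts the whole-nanosecond part of a 52.12 value. */
static uint64_t tick_whole_ns(uint64_t fixed)
{
    return fixed >> TICK_FRAC_BITS;
}

/* Extracts the fractional part, in units of 1/4096 ns. */
static uint32_t tick_frac(uint64_t fixed)
{
    return (uint32_t)(fixed & TICK_FRAC_MASK);
}
```

For example, a tick of 1,000,000.5 ns would be encoded with a whole part of 1000000 and a fraction of 2048, since 2048/4096 is one half.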
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Acked-by: john stultz <johnstul@us.ibm.com>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86_64: Add boot option to disable randomized mappings and cleanup · a62eaf15
      Committed by Andi Kleen
      AMD SimNow!'s JIT doesn't like them at all in the guest. For distribution
      installation it's easiest if it's a boot time option.
      
      Also I moved the variable to a more appropriate place and made
      it independent of sysctl.

      And marked it __read_mostly, which it is.
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  13. 16 Feb, 2006 4 commits
    • [PATCH] swsusp: nuke noisy message · c8adb494
      Committed by Andrew Morton
      I get about 88 squillion of these when suspending an old ad450nx server.
      
      Cc: Pavel Roskin <proski@gnu.org>
      Cc: "Rafael J. Wysocki" <rjw@sisk.pl>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] cpuset: oops in exit on null cpuset fix · 06fed338
      Committed by Paul Jackson
      Fix a latent bug in cpuset_exit() handling.  If a task tried to allocate
      memory after calling cpuset_exit(), it oops'd in
      cpuset_update_task_memory_state() on a NULL cpuset pointer.
      
      So set the exiting task's cpuset to the root cpuset instead of to NULL.
      
      A distro kernel hit this with an added kernel package that had just such a
      hook (allocating memory) in the exit code path.
      Signed-off-by: Paul Jackson <pj@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] fix zap_thread's ptrace related problems · 5ecfbae0
      Committed by Oleg Nesterov
      1. The tracee can go from ptrace_stop() to do_signal_stop()
         after __ptrace_unlink(p).
      
      2. It is unsafe to __ptrace_unlink(p) while p->parent may wait
         for tasklist_lock in ptrace_detach().
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Christoph Hellwig <hch@lst.de>
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] fix kill_proc_info() vs fork() theoretical race · dadac81b
      Committed by Oleg Nesterov
      copy_process:
      
      	attach_pid(p, PIDTYPE_PID, p->pid);
      	attach_pid(p, PIDTYPE_TGID, p->tgid);
      
      What if kill_proc_info(p->pid) happens in between?
      
      copy_process() holds current->sighand.siglock, so we are safe
      in CLONE_THREAD case, because current->sighand == p->sighand.
      
      Otherwise, p->sighand is unlocked, and the new process is already
      visible to find_task_by_pid() but has a copy of the parent's
      'struct pid' in ->pids[PIDTYPE_TGID].
      
      This means that __group_complete_signal() may hang while doing
      
      	do ... while (next_thread() != p)
      
      We can solve this problem if we reverse these 2 attach_pid()s:
      
      	attach_pid() does wmb()
      
      	group_send_sig_info() calls spin_lock(), which
      	provides a read barrier. // Yes ?
      
      I don't think we can hit this race in practice, but still.
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>