1. 24 Jun 2005, 2 commits
    • [PATCH] timers fixes/improvements · 55c888d6
      Authored by Oleg Nesterov
      This patch tries to solve the following problems:
      
      1. del_timer_sync() is racy. The timer can be fired again after
         del_timer_sync() has checked all CPUs and before it rechecks
         timer_pending().
      
      2. It has scalability problems: every CPU is scanned to determine
         whether the timer is running on it.
      
         With this patch del_timer_sync is O(1) and no slower than plain
         del_timer(pending_timer), unless it has to actually wait for
         completion of the currently running timer.
      
         The only restriction is that the recurring timer should not use
         add_timer_on().
      
      3. Timers are not serialized with respect to themselves.
      
         If CPU_0 does mod_timer(jiffies+1) while the timer is currently
         running on CPU_1, it is quite possible that the local timer
         interrupt on CPU_0 will start that timer before it has finished
         on CPU_1.
      
      4. Timer locking is suboptimal. __mod_timer() takes 3 locks at
         once and still requires wmb() in del_timer/run_timers.
      
         The new implementation takes 2 locks sequentially and does not
         need memory barriers.
      
      Currently ->base != NULL means that the timer is pending. In that case
      ->base.lock is used to lock the timer. __mod_timer also takes timer->lock
      because ->base can be == NULL.
      
      This patch uses timer->entry.next != NULL as indication that the timer is
      pending. So it does __list_del(), entry->next = NULL instead of list_del()
      when the timer is deleted.
      
      The ->base field is used for hashed locking only, it is initialized
      in init_timer() which sets ->base = per_cpu(tvec_bases). When the
      tvec_bases.lock is locked, it means that all timers which are tied
      to this base via timer->base are locked, and the base itself is locked
      too.
      
      So __run_timers/migrate_timers can safely modify all timers which could
      be found on ->tvX lists (pending timers).
      
      When the timer's base is locked and the timer is removed from the ->entry
      list (which means that __run_timers()/migrate_timers() can't see this
      timer), it is possible to set timer->base = NULL and drop the lock: the
      timer remains locked.
      
      This patch adds a lock_timer_base() helper, which waits for ->base !=
      NULL, locks the ->base, and re-checks that it is still the same.
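
      A minimal sketch of that helper, following the description above (the
      exact in-tree code may differ in details):

              /* Wait for ->base to become non-NULL, lock it, then re-check
               * that the timer was not migrated while we took the lock. */
              static struct timer_base_s *lock_timer_base(struct timer_list *timer,
                                                          unsigned long *flags)
              {
                      struct timer_base_s *base;

                      for (;;) {
                              base = timer->base;
                              if (base != NULL) {
                                      spin_lock_irqsave(&base->lock, *flags);
                                      if (base == timer->base)
                                              return base;  /* timer is now locked */
                                      spin_unlock_irqrestore(&base->lock, *flags);
                              }
                              cpu_relax();
                      }
              }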
      
      __mod_timer() schedules the timer on the local CPU and changes its base.
      However, it does not lock both old and new bases at once. It locks the
      timer via lock_timer_base(), deletes the timer, sets ->base = NULL, and
      unlocks old base. Then __mod_timer() locks new_base, sets ->base = new_base,
      and adds this timer. This simplifies the code, because AB-BA deadlock is not
      possible. __mod_timer() also ensures that the timer's base is not changed
      while the timer's handler is running on the old base.
      
      __run_timers(), del_timer() do not change ->base anymore, they only clear
      pending flag.
      
      So del_timer_sync() can test timer->base->running_timer == timer to detect
      whether it is running or not.
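
      Built on those pieces, an O(1) del_timer_sync() looks roughly like this
      (a sketch, not the literal patch):

              int del_timer_sync(struct timer_list *timer)
              {
                      struct timer_base_s *base;
                      unsigned long flags;
                      int ret;

                      for (;;) {
                              ret = -1;
                              base = lock_timer_base(timer, &flags);
                              /* Safe only if the handler is not running here. */
                              if (base->running_timer != timer) {
                                      ret = 0;
                                      if (timer->entry.next != NULL) {  /* pending? */
                                              __list_del(timer->entry.prev,
                                                         timer->entry.next);
                                              timer->entry.next = NULL;
                                              ret = 1;
                                      }
                              }
                              spin_unlock_irqrestore(&base->lock, flags);
                              if (ret >= 0)
                                      return ret;
                              cpu_relax();  /* handler still running: retry */
                      }
              }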
      
      We don't need timer_list->lock anymore, this patch kills it.
      
      We also don't need barriers. del_timer() and __run_timers() used smp_wmb()
      before clearing timer's pending flag. It was needed because __mod_timer()
      did not lock old_base if the timer is not pending, so __mod_timer()->list_add()
      could race with del_timer()->list_del(). With this patch these functions are
      serialized through base->lock.
      
      One problem remains: TIMER_INITIALIZER can't use per_cpu(tvec_bases),
      so this patch adds a global
      
              struct timer_base_s {
                      spinlock_t lock;
                      struct timer_list *running_timer;
              } __init_timer_base;
      
      which is used by TIMER_INITIALIZER. The corresponding fields in tvec_t_base_s
      struct are replaced by struct timer_base_s t_base.
      
      It is indeed ugly. But this can't have scalability problems. The global
      __init_timer_base.lock is used only when __mod_timer() is called for the first
      time AND the timer was compile-time initialized. After that the timer migrates
      to the local CPU.
      Signed-off-by: Oleg Nesterov <oleg@tv-sign.ru>
      Acked-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Renaud Lienhart <renaud.lienhart@free.fr>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] i386: Selectable Frequency of the Timer Interrupt · 59121003
      Authored by Christoph Lameter
      Make the timer frequency selectable. The timer interrupt may cause bus
      and memory contention in large NUMA systems since the interrupt occurs
      on each processor HZ times per second.
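
      Schematically, HZ stops being a hard-coded constant and follows the
      configuration choice (CONFIG_HZ as the assumed config symbol; values
      illustrative):

              /* include/asm-i386/param.h, roughly: */
              #ifdef __KERNEL__
              # define HZ       CONFIG_HZ  /* 100, 250 or 1000, chosen at build time */
              # define USER_HZ  100        /* value reported to user space stays fixed */
              #endif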
      Signed-off-by: Christoph Lameter <christoph@lameter.com>
      Signed-off-by: Shai Fultheim <shai@scalex86.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  2. 22 Jun 2005, 6 commits
    • [PATCH] uml: make hw_controller_type->release exist only for archs needing it · b77d6adc
      Authored by Paolo 'Blaisorblade' Giarrusso
      With Chris Wedgwood <cw@f00f.org>
      
      As suggested by Chris, we can make the just-added ->release method
      exist only on architectures requesting it (i.e., only UML currently),
      so that other archs don't get this unneeded crud; if UML stops needing
      it, we can kill it.
      Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      CC: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] uml: add and use generic hw_controller_type->release · dbce706e
      Authored by Paolo 'Blaisorblade' Giarrusso
      With Chris Wedgwood <cw@f00f.org>
      
      Currently UML must explicitly call the UML-specific
      free_irq_by_irq_and_dev() for each free_irq() call it makes.
      
      This is needed because ->shutdown and/or ->disable are only called when the
      last "action" for that irq is removed.
      
      Instead, for UML shared IRQs (UML IRQs are very often, if not always,
      shared), some setup is done for each dev_id, and it must be cleared on
      the release of that fd.  For instance, for each open console a new
      instance (i.e. a new dev_id) of the same IRQ is requested.
      
      Specifically, an fd is stored in an array (pollfds), which is later read
      by a host thread and passed to poll().  Each event registered by poll()
      triggers an interrupt.  So, for each free_irq() we must remove the
      corresponding host fd from the table, which we do via this ->release()
      method.
      
      In this patch we add an appropriate hook for this and point it at the
      procedure described above, replacing the explicit calls to it.
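
      Schematically, the hook and its call site could look like this (a
      sketch per the description; the config-symbol name is an assumption,
      and UML would point ->release at free_irq_by_irq_and_dev()):

              struct hw_interrupt_type {
                      /* ... startup, shutdown, enable, disable, ack, end ... */
              #ifdef CONFIG_IRQ_RELEASE_METHOD
                      /* called once per dev_id from free_irq() */
                      void (*release)(unsigned int irq, void *dev_id);
              #endif
              };

              /* In free_irq(), while unlinking the action for dev_id: */
              if (desc->handler->release)
                      desc->handler->release(irq, dev_id);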
      
      Also some cosmetic improvements are included.
      
      This is heavily based on some work by Chris Wedgwood, which however didn't
      get the patch merged for something I'd call a "misunderstanding" (the need
      for this patch wasn't cleanly explained, thus adding the generic hook was
      felt as undesirable).
      Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
      CC: Ingo Molnar <mingo@redhat.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] dup_mmap: update comment on new vma · 45918e1a
      Authored by Hugh Dickins
      Remove part of comment on linking new vma in dup_mmap: since anon_vma rmap
      came in, try_to_unmap_one knows the vma without needing find_vma.  But add
      a comment to note that here vma is inserted without mmap_sem.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Avoiding mmap fragmentation · 1363c3cd
      Authored by Wolfgang Wander
      Ingo recently introduced a great speedup for allocating new mmaps using the
      free_area_cache pointer which boosts the specweb SSL benchmark by 4-5% and
      causes huge performance increases in thread creation.
      
      The downside of this patch is that it does lead to fragmentation in the
      mmap-ed areas (visible via /proc/self/maps), such that some applications
      that work fine under 2.4 kernels quickly run out of memory on any 2.6
      kernel.
      
      The problem is twofold:
      
        1) the free_area_cache is used to continue a search for memory where
           the last search ended.  Before the change new areas were always
           searched from the base address on.
      
           So now new small areas are cluttering holes of all sizes
           throughout the whole mmap-able region whereas before small holes
           tended to close holes near the base leaving holes far from the base
           large and available for larger requests.
      
        2) the free_area_cache also is set to the location of the last
           munmap-ed area so in scenarios where we allocate e.g.  five regions of
           1K each, then free regions 4 2 3 in this order the next request for 1K
           will be placed in the position of the old region 3, whereas before we
           appended it to the still active region 1, placing it at the location
           of the old region 2.  Before we had 1 free region of 2K, now we only
           get two free regions of 1K -> fragmentation.
      
      The patch addresses these issues by introducing yet another cache
      descriptor, cached_hole_size, which contains the largest known hole size
      below the current free_area_cache.  If a new request comes in, its size
      is compared against cached_hole_size; if the request can be filled with
      a hole below free_area_cache, the search starts from the base instead.
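
      In outline, the search in arch_get_unmapped_area() then behaves like
      this (a simplified sketch of the heuristic, not the exact patch):

              unsigned long get_unmapped_area_sketch(struct mm_struct *mm,
                                                     unsigned long len)
              {
                      struct vm_area_struct *vma;
                      unsigned long addr;

                      if (len <= mm->cached_hole_size) {
                              /* A big-enough hole is known to exist below
                               * free_area_cache: restart from the base so
                               * small holes near the base get reused first. */
                              mm->cached_hole_size = 0;
                              mm->free_area_cache = TASK_UNMAPPED_BASE;
                      }
                      addr = mm->free_area_cache;

                      for (vma = find_vma(mm, addr); ; vma = vma->vm_next) {
                              if (!vma || addr + len <= vma->vm_start) {
                                      mm->free_area_cache = addr + len;
                                      return addr;  /* fitting hole found */
                              }
                              /* Remember the largest hole seen so far that
                               * was still too small for this request. */
                              if (addr + mm->cached_hole_size < vma->vm_start)
                                      mm->cached_hole_size = vma->vm_start - addr;
                              addr = vma->vm_end;
                      }
              }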
      
      The results look promising: Whereas 2.6.12-rc4 fragments quickly and my
      (earlier posted) leakme.c test program terminates after 50000+ iterations
      with 96 distinct and fragmented maps in /proc/self/maps it performs nicely
      (as expected) with thread creation, Ingo's test_str02 with 20000 threads
      requires 0.7s system time.
      
      Taking out Ingo's patch (un-patch available per request) by basically
      deleting all mentions of free_area_cache from the kernel and always
      starting the search for new memory at the respective bases, we observe:
      leakme terminates successfully with 11 distinct, hardly fragmented areas
      in /proc/self/maps, but thread creation is grindingly slow: 30+s(!)
      system time for Ingo's test_str02 with 20000 threads.
      
      Now - drumroll ;-) the appended patch works fine with leakme: it ends with
      only 7 distinct areas in /proc/self/maps and also thread creation seems
      sufficiently fast with 0.71s for 20000 threads.
      Signed-off-by: Wolfgang Wander <wwc@rentec.com>
      Credit-to: "Richard Purdie" <rpurdie@rpsys.net>
      Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>
      Acked-by: Ingo Molnar <mingo@elte.hu> (partly)
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] VM: early zone reclaim · 753ee728
      Authored by Martin Hicks
      This is the core of the (much simplified) early reclaim.  The goal of this
      patch is to reclaim some easily-freed pages from a zone before falling back
      onto another zone.
      
      One of the major uses of this is NUMA machines.  With the default allocator
      behavior the allocator would look for memory in another zone, which might be
      off-node, before trying to reclaim from the current zone.
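
      Conceptually, the allocator's zone fallback loop changes along these
      lines (a sketch; zone_has_free_pages() and buffered_rmqueue() stand in
      for the real watermark check and allocation path):

              for (i = 0; zones[i] != NULL; i++) {
                      struct zone *z = zones[i];

                      if (!zone_has_free_pages(z, order)) {
                              /* Early reclaim: if enabled for this zone, try
                               * to free easy pages here before falling back. */
                              if (!z->reclaim_pages ||
                                  !zone_reclaim(z, gfp_mask, order) ||
                                  !zone_has_free_pages(z, order))
                                      continue;  /* next (maybe off-node) zone */
                      }
                      return buffered_rmqueue(z, order, gfp_mask);
              }
              return NULL;  /* all zones exhausted */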
      
      This adds a zone tuneable to enable early zone reclaim.  It is selected on a
      per-zone basis and is turned on/off via syscall.
      
      Adding some extra throttling on the reclaim was also required (patch
      4/4); without it, the machine would grind to a crawl when doing a
      "make -j" kernel build.  Even with this patch the system time is higher
      on average, but it seems tolerable.  Here are some numbers for kernbench
      runs on a 2-node, 4-CPU, 8GB-RAM Altix in the "make -j" run:
      
                           wall  user  sys  %cpu  ctx sw.  sleeps
                           ----  ----  ---  ----  -------  ------
      No patch             1009  1384  847   258   298170  504402
      w/patch, no reclaim   880  1376  667   288   254064  396745
      w/patch & reclaim    1079  1385  926   252   291625  548873
      
      These numbers are the average of 2 runs of 3 "make -j" runs done right
      after system boot.  Run-to-run variability for "make -j" is huge, so
      these numbers aren't terribly useful except to see that with reclaim
      the benchmark still finishes in a reasonable amount of time.
      
      I also looked at the NUMA hit/miss stats for the "make -j" runs and the
      reclaim doesn't make any difference when the machine is thrashing away.
      
      Doing a "make -j8" on a single node that is filled with page cache pages
      takes 700 seconds with reclaim turned on and 735 seconds without reclaim
      (due to remote memory accesses).
      
      The simple zone_reclaim syscall program is at
      http://www.bork.org/~mort/sgi/zone_reclaim.c
      Signed-off-by: Martin Hicks <mort@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] smp_processor_id() cleanup · 39c715b7
      Authored by Ingo Molnar
      This patch implements a number of smp_processor_id() cleanup ideas that
      Arjan van de Ven and I came up with.
      
      The previous __smp_processor_id/_smp_processor_id/smp_processor_id API
      spaghetti was hard to follow, both in its implementation and at its
      usage sites.
      
      Some of the complexity arose from picking wrong names, some of the
      complexity comes from the fact that not all architectures defined
      __smp_processor_id.
      
      In the new code, there are two externally visible symbols:
      
       - smp_processor_id(): debug variant.
      
       - raw_smp_processor_id(): nondebug variant. Replaces all existing
         uses of _smp_processor_id() and __smp_processor_id(). Defined
         by every SMP architecture in include/asm-*/smp.h.
      
      There is one new internal symbol, dependent on DEBUG_PREEMPT:
      
       - debug_smp_processor_id(): internal debug variant, mapped to
                                   smp_processor_id().
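
      Put together, the mapping between the three symbols can be sketched as:

              /* include/linux/smp.h, in outline: */
              #ifdef CONFIG_DEBUG_PREEMPT
                extern unsigned int debug_smp_processor_id(void);
              # define smp_processor_id() debug_smp_processor_id()
              #else
              # define smp_processor_id() raw_smp_processor_id()
              #endif

              /* raw_smp_processor_id() comes from each architecture's
               * include/asm-<arch>/smp.h, e.g. on i386 (illustrative): */
              #define raw_smp_processor_id() (current_thread_info()->cpu)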
      
      Also, I moved debug_smp_processor_id() from lib/kernel_lock.c into a new
      lib/smp_processor_id.c file.  All related comments got updated and/or
      clarified.
      
      I have build/boot tested the following 8 .config combinations on x86:
      
       {SMP,UP} x {PREEMPT,!PREEMPT} x {DEBUG_PREEMPT,!DEBUG_PREEMPT}
      
      I have also build/boot tested x64 on UP/PREEMPT/DEBUG_PREEMPT.  (Other
      architectures are untested, but should work just fine.)
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Arjan van de Ven <arjan@infradead.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  3. 21 Jun 2005, 1 commit
  4. 18 Jun 2005, 1 commit
  5. 14 Jun 2005, 1 commit
  6. 01 Jun 2005, 1 commit
    • [PATCH] flush icache in correct context · ae92ef8a
      Authored by Roman Zippel
      flush_icache_range() is used in two different situations: in
      binfmt_elf.c & co. for user-space mappings, and in module.c for kernel
      modules.  On m68k, flush_icache_range() doesn't know which data to
      flush, as it has separate address spaces and the pointer argument can
      be valid in either address space.
      
      First I considered splitting flush_icache_range(), but this patch is
      simpler.  Setting the correct context gives flush_icache_range() enough
      information to flush the correct data.
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  7. 29 May 2005, 1 commit
    • [PATCH] drop note_interrupt() for per-CPU for proper scaling · b60c1f6f
      Authored by John Hawkes
      The "unhandled interrupts" catcher, note_interrupt(), increments a global
      desc->irq_count and grossly damages scaling of very large systems, e.g.,
      >192p ia64 Altix, because of this highly contented cacheline, especially
      for timer interrupts.  384p is severely crippled, and 512p is unuseable.
      
      All calls to note_interrupt() can be disabled by booting with "noirqdebug",
      but this disables the useful interrupt checking for all interrupts.
      
      I propose eliminating note_interrupt() for all per-CPU interrupts.  This
      was the behavior of linux-2.6.10 and earlier, but in 2.6.11 a code
      restructuring added a call to note_interrupt() for per-CPU interrupts.
      Besides, note_interrupt() is a bit racy for concurrent CPU calls anyway, as
      the desc->irq_count++ increment isn't atomic (which, if done, would make
      scaling even worse).
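
      The shape of the change in the interrupt path is roughly this (a
      sketch; the IRQ_PER_CPU status test is how I'd express the mechanism,
      not the literal diff):

              /* In the per-CPU branch of __do_IRQ(): no note_interrupt(),
               * hence no write to the shared desc->irq_count cacheline. */
              if (desc->status & IRQ_PER_CPU) {
                      desc->handler->ack(irq);
                      action_ret = handle_IRQ_event(irq, regs, desc->action);
                      /* note_interrupt(irq, desc, action_ret) intentionally gone */
                      desc->handler->end(irq);
                      return 1;
              }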
      Signed-off-by: John Hawkes <hawkes@sgi.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  8. 27 May 2005, 2 commits
    • [PATCH] cpuset exit NULL dereference fix · 2efe86b8
      Authored by Paul Jackson
      There is a race in the kernel cpuset code, between the code
      to handle notify_on_release, and the code to remove a cpuset.
      The notify_on_release code can end up trying to access a
      cpuset that has been removed.  In the most common case, this
      causes a NULL pointer dereference from the routine cpuset_path.
      However all manner of bad things are possible, in theory at least.
      
      The existing code decrements the cpuset use count, and if the
      count goes to zero, processes the notify_on_release request,
      if appropriate.  However, once the count goes to zero, unless we
      are holding the global cpuset_sem semaphore, there is nothing to
      stop another task from immediately removing the cpuset entirely,
      and recycling its memory.
      
      The obvious fix would be to always hold the cpuset_sem
      semaphore while decrementing the use count and dealing with
      notify_on_release.  However we don't want to force a global
      semaphore into the mainline task exit path, as that might create
      a scaling problem.
      
      The actual fix is almost as easy: since this is only an issue for
      cpusets using notify_on_release, which the top-level big cpusets don't
      normally need, take cpuset_sem only for cpusets using
      notify_on_release.
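
      In outline, the exit-path release becomes (a sketch; helper names
      follow the cpuset code's conventions):

              static void cpuset_put_sketch(struct cpuset *cs)
              {
                      if (notify_on_release(cs)) {
                              /* Rare case: hold cpuset_sem so nobody can
                               * remove and recycle cs under our feet. */
                              down(&cpuset_sem);
                              if (atomic_dec_and_test(&cs->count))
                                      check_for_release(cs);
                              up(&cpuset_sem);
                      } else {
                              /* Common case: no global semaphore needed. */
                              atomic_dec(&cs->count);
                      }
              }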
      
      This code has been run for hours without a hiccup, while running
      a cpuset create/destroy stress test that could crash the existing
      kernel in seconds.  This patch applies to the current -linus
      git kernel.
      Signed-off-by: Paul Jackson <pj@sgi.com>
      Acked-by: Simon Derr <simon.derr@bull.net>
      Acked-by: Dinakar Guniguntala <dino@in.ibm.com>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  9. 26 May 2005, 1 commit
    • AUDIT: Defer freeing aux items until audit_free_context() · 7551ced3
      Authored by David Woodhouse
      While they were all just simple blobs it made sense to just free them
      as we walked through and logged them. Now that there are pointers to
      other objects which need refcounting, we might as well revert to
      _only_ logging them in audit_log_exit(), and put the code to free them
      properly in only one place -- in audit_free_aux().
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
  10. 25 May 2005, 1 commit
  11. 24 May 2005, 2 commits
  12. 22 May 2005, 2 commits
    • AUDIT: Assign serial number to non-syscall messages · bfb4496e
      Authored by David Woodhouse
      Move audit_serial() into audit.c and use it to generate serial numbers 
      on messages even when there is no audit context from syscall auditing.  
      This allows us to disambiguate audit records when more than one is 
      generated in the same millisecond.
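
      Such a generator is essentially a lock-protected counter; a sketch:

              static spinlock_t serial_lock = SPIN_LOCK_UNLOCKED;
              static unsigned int serial;

              /* Never returns 0, so a zero serial can mean "unset". */
              unsigned int audit_serial(void)
              {
                      unsigned long flags;
                      unsigned int ret;

                      spin_lock_irqsave(&serial_lock, flags);
                      do {
                              ret = ++serial;
                      } while (!ret);
                      spin_unlock_irqrestore(&serial_lock, flags);
                      return ret;
              }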
      
      Based on a patch by Steve Grubb after he observed the problem.
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
    • [PATCH] spin_unlock_bh() and preempt_check_resched() · 10f02d1c
      Authored by Samuel Thibault
      In _spin_unlock_bh(lock):
      	do { \
      		_raw_spin_unlock(lock); \
      		preempt_enable(); \
      		local_bh_enable(); \
      		__release(lock); \
      	} while (0)
      
      there is no reason to use preempt_enable() instead of a simple
      preempt_enable_no_resched().
      
      Since we know bottom halves are disabled, preempt_schedule() will always
      return at once (preempt_count!=0), and hence preempt_check_resched() is
      useless here...
      
      This fixes it by using "preempt_enable_no_resched()" instead of the
      "preempt_enable()", and thus avoids the useless preempt_check_resched()
      just before re-enabling bottom halves.
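
      After the fix the sequence reads (sketch of the resulting macro body):

              do { \
                      _raw_spin_unlock(lock); \
                      preempt_enable_no_resched(); \
                      local_bh_enable(); \
                      __release(lock); \
              } while (0)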
      Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  13. 21 May 2005, 4 commits
  14. 19 May 2005, 4 commits
    • AUDIT: Honour audit_backlog_limit again. · fb19b4c6
      Authored by David Woodhouse
      The limit on the number of outstanding audit messages was inadvertently
      removed with the switch to queuing skbs directly for sending by a kernel
      thread. Put it back again.
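
      The restored check is simple; in audit_log_start(), roughly (a sketch):

              /* Refuse new records once too many are queued for the
               * kernel thread to push up to auditd. */
              if (audit_backlog_limit &&
                  skb_queue_len(&audit_skb_queue) > audit_backlog_limit)
                      return NULL;  /* caller gets no audit_buffer */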
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
    • AUDIT: Quis Custodiet Ipsos Custodes? · 7ca00264
      Authored by David Woodhouse
      Nobody does. Really, it gets very silly if auditd is recording its
      own actions.
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
    • AUDIT: Send netlink messages from a separate kernel thread · b7d11258
      Authored by David Woodhouse
      netlink_unicast() will attempt to reallocate and will free messages if
      the socket's rcvbuf limit is reached unless we give it an infinite 
      timeout. So do that, from a kernel thread which is dedicated to spewing
      stuff up the netlink socket.
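
      The thread amounts to a drain loop (a sketch; error handling and the
      infinite-sndtimeo setup are omitted):

              static int kauditd_thread(void *dummy)
              {
                      struct sk_buff *skb;

                      for (;;) {
                              skb = skb_dequeue(&audit_skb_queue);
                              if (skb)
                                      /* blocking unicast: never drop a record
                                       * just because auditd's rcvbuf is full */
                                      netlink_unicast(audit_sock, skb,
                                                      audit_pid, 0);
                              else
                                      wait_event(kauditd_wait,
                                                 skb_queue_len(&audit_skb_queue));
                      }
              }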
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
    • AUDIT: Clean up logging of untrusted strings · 168b7173
      Authored by Steve Grubb
      * If vsnprintf returns -1, it will mess up the sk_buff space accounting.
      This is fixed by not calling skb_put() with bogus len values.
      
      * audit_log_hex was a loop that called audit_log_vformat with %02X for
      each byte.  This is very inefficient, since converting an unsigned byte
      to its ASCII hex representation is essentially masking, shifting, and
      byte lookups.  Also, the length of the converted string is well known:
      it's twice the original.  Fixed by rewriting the function.
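
      Rewritten, the conversion is a table lookup per nibble, done in chunks;
      a sketch (the in-tree version writes straight into the skb, but the
      conversion idea is the same):

              void audit_log_hex(struct audit_buffer *ab,
                                 const unsigned char *buf, size_t len)
              {
                      static const char hex[] = "0123456789ABCDEF";
                      char out[65];  /* 32 input bytes per chunk, plus NUL */
                      size_t i = 0, j;

                      while (i < len) {
                              for (j = 0; j + 1 < sizeof(out) && i < len;
                                   i++, j += 2) {
                                      out[j]     = hex[buf[i] >> 4];    /* high */
                                      out[j + 1] = hex[buf[i] & 0x0f];  /* low */
                              }
                              out[j] = '\0';
                              audit_log_format(ab, "%s", out);
                      }
              }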
      
      * audit_log_untrustedstring had no comments. This makes it hard for 
      someone to understand what the string format will be.
      
      * audit_log_d_path was never converted to use untrustedstring.  This
      could mess up user-space parsers.  Fixed by building a temp buffer,
      calling d_path, and logging the temp buffer using untrustedstring.
      
      From: Steve Grubb <sgrubb@redhat.com>
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>
  15. 18 May 2005, 2 commits
  16. 17 May 2005, 4 commits
  17. 14 May 2005, 3 commits
  18. 13 May 2005, 1 commit
  19. 11 May 2005, 1 commit
    • Add audit_log_type · c1b773d8
      Authored by Chris Wright
      Add audit_log_type to allow callers to specify type and pid when
      logging.  Convert audit_log to a wrapper around audit_log_type.  We
      could have converted all audit_log callers directly, but the common
      case is the default of type AUDIT_KERNEL and pid 0.  Update
      audit_log_start to take type and pid values when creating a new
      audit_buffer.  Move sequences that did audit_log_start,
      audit_log_format, audit_set_type, audit_log_end to simply call
      audit_log_type directly.  This obsoletes audit_set_type and
      audit_set_pid, so remove them.
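
      The relationship can be sketched like this (argument order is an
      assumption; audit_log() calls it with AUDIT_KERNEL and pid 0):

              void audit_log_type(struct audit_context *ctx, int type, int pid,
                                  const char *fmt, ...)
              {
                      struct audit_buffer *ab;
                      va_list args;

                      /* type and pid now travel through audit_log_start() */
                      ab = audit_log_start(ctx, type, pid);
                      if (!ab)
                              return;  /* e.g. backlog limit hit */
                      va_start(args, fmt);
                      audit_log_vformat(ab, fmt, args);
                      va_end(args);
                      audit_log_end(ab);
              }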
      Signed-off-by: Chris Wright <chrisw@osdl.org>
      Signed-off-by: David Woodhouse <dwmw2@infradead.org>