1. 17 Jun, 2009 5 commits
    • tracing/filters: fix race between filter setting and module unload · 00e95830
      Li Zefan committed
      Module unload is protected by event_mutex, while setting filter is
      protected by filter_mutex. This leads to the race:
      
      echo 'bar == 0 || bar == 10' \    |
                      > sample/filter   |
                                        |  insmod sample.ko
        add_pred("bar == 0")            |
          -> n_preds == 1               |
        add_pred("bar == 10")           |
          -> n_preds == 2               |
                                        |  rmmod sample.ko
                                        |  insmod sample.ko
        add_pred("||")                  |
          -> n_preds == 1 (should be 3) |
      
      Now event->filter->preds is corrupted, and the next call to
      filter_match_preds() will trigger the WARN_ON() in it.
      
      To avoid the race, we remove filter_mutex, and replace it with event_mutex.
      
      [ Impact: prevent corruption of filters by module removing and loading ]
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A375A4D.6000205@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • tracing/filters: free filter_string in destroy_preds() · 57be8887
      Li Zefan committed
      filter->filter_string is not freed when unloading a module:
      
       # insmod trace-events-sample.ko
       # echo "bar < 100" > /mnt/tracing/events/sample/foo_bar/filter
       # rmmod trace-events-sample.ko
      
      [ Impact: fix memory leak when unloading module ]
      Signed-off-by: Li Zefan <lizf@cn.fujitsu.com>
      LKML-Reference: <4A375A30.9060802@cn.fujitsu.com>
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: use commit counters for commit pointer accounting · fa743953
      Steven Rostedt committed
      The ring buffer is made up of three sets of pointers.
      
      The head page pointer, which points to the next page for the reader to
      get.
      
      The commit pointer and commit index, which point to the page and index
      of the last committed write respectively.
      
      The tail pointer and tail index, which point to the page and the index
      of the last reserved data respectively (not yet committed).
      
      The commit pointer is only moved forward by the outermost writer.
      If a nested writer comes in, it will not move the pointer forward.
      
      The current implementation has a flaw. It assumes that the outermost
      writer successfully reserved data. There's a small race window where
      the outermost writer could find the tail pointer, but a nested
      writer could come in (via interrupt) and move the tail forward, and
      even the commit forward.
      
      The outer writer would not realize the commit moved forward and the
      accounting will break.
      
      This patch changes the design to use counters in the per cpu buffers
      to keep track of commits. The counters are incremented at the start
      of the commit, and decremented at the end. If the end commit counter
      is 1, then it moves the commit pointers. A loop is made to check for
      races between checking and moving the commit pointers. Only the outer
      commit should move the pointers anyway.
      
      The test for whether a reserve is equal to the last commit update
      is still needed for timekeeping. The time code is much less
      racy than the commit updates.
      
      This change not only solves the mentioned race, but also makes the
      code simpler.
      
      [ Impact: fix commit race and simplify code ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
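The counter scheme described in the ring-buffer commit above can be illustrated with a toy, single-threaded model. This is only a sketch under stated assumptions: every `toy_` name is hypothetical, plain integers stand in for the page/index pointer pairs, and the kernel's per-CPU atomics and barriers are omitted. It shows the two points the message makes: only the writer that sees a counter of 1 moves the commit, and a re-check loop covers a nested writer arriving between the check and the decrement.

```c
#include <assert.h>
#include <stddef.h>

/* Toy per-CPU buffer: indexes stand in for the page/index pointer pairs. */
struct toy_cpu_buffer {
	size_t tail;      /* end of the last reserved (possibly uncommitted) data */
	size_t commit;    /* end of the last committed data */
	long committing;  /* writers currently between start and end of a commit */
	long commits;     /* total commits started; used to detect a race */
};

/* Counters are incremented at the start of a commit. */
static void toy_start_commit(struct toy_cpu_buffer *b)
{
	b->committing++;
	b->commits++;
}

/* Reserve len units at the tail and return where the data starts. */
static size_t toy_reserve(struct toy_cpu_buffer *b, size_t len)
{
	size_t pos = b->tail;

	b->tail += len;
	return pos;
}

/* Counter is decremented at the end; only the outermost writer moves commit. */
static void toy_end_commit(struct toy_cpu_buffer *b)
{
	long seen;

	assert(b->committing > 0);
again:
	seen = b->commits;
	if (b->committing == 1)
		b->commit = b->tail;
	b->committing--;
	/*
	 * In real interrupt context a nested writer could run between the
	 * check above and the decrement. If that happened and nobody else is
	 * committing, take the counter back and redo the update. In this
	 * single-threaded toy the branch is never taken.
	 */
	if (b->commits != seen && b->committing == 0) {
		b->committing++;
		goto again;
	}
}

/*
 * An outer writer is "interrupted" by a nested writer.
 * Returns the commit index observed right after the nested writer finished.
 */
static size_t toy_nested_demo(struct toy_cpu_buffer *b)
{
	size_t after_nested;

	toy_start_commit(b);   /* outer writer begins */
	toy_reserve(b, 8);

	toy_start_commit(b);   /* nested writer, as if from an interrupt */
	toy_reserve(b, 4);
	toy_end_commit(b);     /* counter was 2: commit does not move */
	after_nested = b->commit;

	toy_end_commit(b);     /* counter was 1: commit catches up with tail */
	return after_nested;
}
```

Starting from a zeroed `struct toy_cpu_buffer`, `toy_nested_demo()` leaves the commit index untouched while the nested writer finishes, and the outer writer's end-of-commit then advances it past both reservations.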
    • ring-buffer: remove unused variable · 263294f3
      Steven Rostedt committed
      Fix the compiler warning:
      
      kernel/trace/ring_buffer.c: In function 'rb_move_tail':
      kernel/trace/ring_buffer.c:1236: warning: unused variable 'event'
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
    • ring-buffer: have benchmark test handle discarded events · 9086c7b9
      Steven Rostedt committed
      With the addition of commit:
      
        c7b09308
        ring-buffer: prevent adding write in discarded area
      
      The ring buffer may now add discarded events when a write passes
      the end of a buffer page. Before, a discarded event was only added
      when the tracer deliberately created one. The ring buffer benchmark
      test does not handle discarded events when it reads the buffer and
      fails when it encounters one.
      
      Also fix the increment for large data entries (luckily, the test did
      not add any yet).
      
      [ Impact: fix false failure of ring buffer self test ]
      Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
  2. 15 Jun, 2009 6 commits
  3. 13 Jun, 2009 9 commits
  4. 12 Jun, 2009 18 commits
    • sched: export kick_process · b43e3521
      Rusty Russell committed
      lguest needs kick_process: wake_up_process() does nothing if a process
      is running, which isn't sufficient (we need it in the kernel).
      
      And lguest support is usually modular.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: Ingo Molnar <mingo@elte.hu>
    • perf_counter: Add forward/backward attribute ABI compatibility · 974802ea
      Peter Zijlstra committed
      Provide for means of extending the perf_counter_attr in a 'natural' way.
      
      We allow growing the structure by appending fields at the end by specifying
      the full structure size inside it.
      
      When a new kernel sees a smaller (old) structure, it will zero-pad the tail.
      When an old kernel sees a larger (new) structure, it will verify that the
      tail consists of zeros, and fail otherwise.
      
      If we fail due to a size mismatch, we return -E2BIG and write the kernel's
      native attribute size back into the provided structure.
      
      Furthermore, add some attribute verification, so that we'll fail counter
      creation when unknown bits are present (PERF_SAMPLE, PERF_FORMAT, or in
      the __reserved fields).
      
      (This ABI detail is introduced while keeping the existing syscall ABI.)
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
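The size negotiation described in the perf_counter commit above can be sketched in plain userspace C. This is a minimal illustration, not the kernel's code: the `toy_attr*` structures, their fields, and `toy_copy_attr()` are hypothetical, and the sketch assumes the stated size is the first field and that the caller's buffer really holds that many bytes. Treating a too-small size as a size mismatch is also an assumption of the sketch.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout the "kernel" in this sketch was built with. */
struct toy_attr {
	uint32_t size;    /* size of the structure as the caller knows it */
	uint32_t type;
	uint64_t config;
};

/* Hypothetical older layout: lacks the trailing field. */
struct toy_attr_old {
	uint32_t size;
	uint32_t type;
};

/* Hypothetical newer layout: one field appended at the end. */
struct toy_attr_new {
	uint32_t size;
	uint32_t type;
	uint64_t config;
	uint64_t extra;
};

/*
 * Copy a caller-provided attribute blob into kattr.
 * - Smaller (old) blob: the missing tail of kattr is zero-padded.
 * - Larger (new) blob: accepted only if the unknown tail is all zeros.
 * - On a size mismatch: write our native size back and return -E2BIG.
 * ubuf must point to at least as many bytes as its size field claims.
 */
static int toy_copy_attr(void *ubuf, struct toy_attr *kattr)
{
	const uint32_t ksize = (uint32_t)sizeof(*kattr);
	unsigned char *u = ubuf;
	uint32_t usize;
	uint32_t i;

	memcpy(&usize, u, sizeof(usize));
	if (usize < sizeof(usize))
		goto err_size;

	if (usize > ksize) {
		/* Unknown fields must be unused, i.e. zero. */
		for (i = ksize; i < usize; i++)
			if (u[i] != 0)
				goto err_size;
		usize = ksize;
	}

	memset(kattr, 0, ksize);     /* zero-pad whatever the caller lacks */
	memcpy(kattr, u, usize);
	return 0;

err_size:
	memcpy(u, &ksize, sizeof(ksize));  /* report the native size */
	return -E2BIG;
}
```

With these definitions, a `struct toy_attr_old` is accepted and `config` reads as zero, a `struct toy_attr_new` is accepted while `extra` is zero, and a non-zero `extra` yields `-E2BIG` with the caller's `size` field overwritten by `sizeof(struct toy_attr)`.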
    • perf_counter: Remove PERF_TYPE_RAW special casing · 081fad86
      Peter Zijlstra committed
      The PERF_TYPE_RAW special case seems superfluous these days. Remove
      it and handle it in the switch() statement like the others.
      
      [ Impact: cleanup ]
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • module: trim exception table on init free. · ad6561df
      Rusty Russell committed
      It's theoretically possible that there are exception table entries
      which point into the (freed) init text of modules.  These could cause
      future problems if other modules get loaded into that memory and cause
      an exception as we'd see the wrong fixup.  The only case I know of is
      kvm-intel.ko (when CONFIG_CC_OPTIMIZE_FOR_SIZE=n).
      
      Amerigo fixed this long-standing FIXME in the x86 version, but this
      patch is more general.
      
      This implements trim_init_extable(); most archs are simple since they
      use the standard lib/extable.c sort code.  Alpha and IA64 use relative
      addresses in their fixups, so their trimming is a slight variation.
      
      Sparc32 is unique; it doesn't seem to define ARCH_HAS_SORT_EXTABLE,
      yet it defines its own sort_extable() which overrides the one in lib.
      It doesn't sort, so we have to mark deleted entries instead of
      actually trimming them.
      Inspired-by: Amerigo Wang <amwang@redhat.com>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
      Cc: linux-alpha@vger.kernel.org
      Cc: sparclinux@vger.kernel.org
      Cc: linux-ia64@vger.kernel.org
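For the simple case the exception-table commit above describes (a table already sorted by faulting address), trimming could look roughly like the following. This is a sketch under assumptions: the `toy_` structures and field names are hypothetical stand-ins, addresses are absolute rather than relative, and the init region is taken to be one contiguous range, so its entries can only sit at the start or the end of the sorted table.

```c
#include <stddef.h>

/* Hypothetical exception table entry with absolute addresses. */
struct toy_extable_entry {
	unsigned long insn;    /* address of the instruction that may fault */
	unsigned long fixup;   /* address to resume at after the fault */
};

/* Hypothetical module descriptor holding just what the sketch needs. */
struct toy_module {
	struct toy_extable_entry *extable;  /* sorted by insn */
	size_t num_exentries;
	unsigned long init_start;           /* start of the init text */
	unsigned long init_size;            /* length of the init text */
};

/* Does addr fall inside the module's init region? */
static int toy_within_init(unsigned long addr, const struct toy_module *m)
{
	return addr >= m->init_start && addr < m->init_start + m->init_size;
}

/* Drop entries pointing into init text from both ends of the sorted table. */
static void toy_trim_init_extable(struct toy_module *m)
{
	/* Trim the beginning. */
	while (m->num_exentries && toy_within_init(m->extable[0].insn, m)) {
		m->extable++;
		m->num_exentries--;
	}
	/* Trim the end. */
	while (m->num_exentries &&
	       toy_within_init(m->extable[m->num_exentries - 1].insn, m))
		m->num_exentries--;
}
```

Because nothing is moved or copied, the trim is just a pointer bump and a count adjustment. That only works when the table is sorted, which is why an architecture whose table is not sorted has to mark deleted entries instead.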
    • module_param: allow 'bool' module_params to be bool, not just int. · fddd5201
      Rusty Russell committed
      Impact: API cleanup
      
      For historical reasons, 'bool' parameters must be an int, not a bool.
      But there are around 600 users, so a conversion seems like useless churn.
      
      So we use __same_type() to distinguish, and handle both cases.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
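The idea of using a compile-time type comparison to accept both `int` and `bool` variables, as the module_param commit above describes, can be sketched as follows. This is an illustration rather than the kernel's implementation: `toy_same_type()` is a stand-in assumed to behave like `__same_type()` (here built on the GCC/Clang builtin `__builtin_types_compatible_p`), and `struct toy_param`, `TOY_BOOL_PARAM()`, and `toy_param_set_bool()` are hypothetical names.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the kernel's __same_type(); requires GCC or Clang. */
#define toy_same_type(a, b) \
	__builtin_types_compatible_p(__typeof__(a), __typeof__(b))

/* Hypothetical parameter descriptor. */
struct toy_param {
	void *arg;      /* points at the int or bool variable */
	bool is_bool;   /* decided at compile time from the variable's type */
};

/* Initializer that records whether var is a real bool or a legacy int. */
#define TOY_BOOL_PARAM(var) \
	{ .arg = &(var), .is_bool = toy_same_type((var), bool) }

/* Parse a yes/no string and store it with the width the variable has. */
static int toy_param_set_bool(const char *val, const struct toy_param *p)
{
	bool v;

	if (val == NULL)
		return -EINVAL;

	switch (val[0]) {
	case 'y': case 'Y': case '1':
		v = true;
		break;
	case 'n': case 'N': case '0':
		v = false;
		break;
	default:
		return -EINVAL;
	}

	if (p->is_bool)
		*(bool *)p->arg = v;
	else
		*(int *)p->arg = v;
	return 0;
}
```

A legacy `int flag;` and a new `bool flag;` can both be wrapped with `TOY_BOOL_PARAM(flag)`. The setter writes through the correctly typed pointer in each case, so none of the existing `int` users need converting.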
    • module_param: split perm field into flags and perm · 45fcc70c
      Rusty Russell committed
      Impact: cleanup
      
      Rather than hack KPARAM_KMALLOCED into the perm field, separate it out.
      Since the perm field was 32 bits and only needs 16, we don't add bloat.
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • module_param: invbool should take a 'bool', not an 'int' · 9a71af2c
      Rusty Russell committed
      It takes an 'int' for historical reasons, and there are only two
      users: simply switch it over to bool.
      
      The other user (uvesafb.c) will get a (harmless-on-x86) warning until
      the next patch is applied.
      
      Cc: Brad Douglas <brad@neruo.com>
      Cc: Michal Januszewski <spock@gentoo.org>
      Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
    • irq: slab alloc for default irq_affinity · 28be225b
      Yinghai Lu committed
      Ingo hit the following warning:
      
      [    0.000000] ------------[ cut here ]------------
      [    0.000000] WARNING: at mm/bootmem.c:537 alloc_arch_preferred_bootmem+0x2b/0x71()
      [    0.000000] Hardware name: System Product Name
      [    0.000000] Modules linked in:
      [    0.000000] Pid: 0, comm: swapper Tainted: G        W  2.6.30-tip-03087-g0bb2618-dirty #52506
      [    0.000000] Call Trace:
      [    0.000000]  [<81032588>] warn_slowpath_common+0x60/0x90
      [    0.000000]  [<810325c5>] warn_slowpath_null+0xd/0x10
      [    0.000000]  [<819d1bc0>] alloc_arch_preferred_bootmem+0x2b/0x71
      [    0.000000]  [<819d1c31>] ___alloc_bootmem_nopanic+0x2b/0x9a
      [    0.000000]  [<81050a0a>] ? lock_release+0xac/0xb2
      [    0.000000]  [<819d1d4c>] ___alloc_bootmem+0xe/0x2d
      [    0.000000]  [<819d1e9f>] __alloc_bootmem+0xa/0xc
      [    0.000000]  [<819d7c63>] alloc_bootmem_cpumask_var+0x21/0x26
      [    0.000000]  [<819d0cc8>] early_irq_init+0x15/0x10d
      [    0.000000]  [<819bb75a>] start_kernel+0x167/0x326
      [    0.000000]  [<819bb06b>] __init_begin+0x6b/0x70
      [    0.000000] ---[ end trace 4eaa2a86a8e2da23 ]---
      [    0.000000] NR_IRQS:2304 nr_irqs:424
      [    0.000000] CPU 0 irqstacks, hard=821e6000 soft=821e7000
      
      So init_irq_default_affinity() needs to allocate from slab rather than
      from bootmem.
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
    • Push BKL down into ->remount_fs() · 337eb00a
      Alessio Igor Bogani committed
      [xfs, btrfs, capifs, shmem don't need BKL, exempt]
      Signed-off-by: Alessio Igor Bogani <abogani@texware.it>
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • Switch collect_mounts() to struct path · 589ff870
      Al Viro committed
      Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
    • slow_work_thread() should do the exclusive wait · b415c49a
      Oleg Nesterov committed
      slow_work_thread() sleeps on slow_work_thread_wq without WQ_FLAG_EXCLUSIVE,
      which means that slow_work_enqueue()->__wake_up(nr_exclusive => 1) wakes up
      all kslowd threads.  This is not what we want, so change slow_work_thread()
      to use prepare_to_wait_exclusive() instead.
      Signed-off-by: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: David Howells <dhowells@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • irq: use kcalloc() instead of the bootmem allocator · 22fb4e71
      Pekka Enberg committed
      Fixes the following problem:
      
      [    0.000000] Experimental hierarchical RCU init done.
      [    0.000000] NR_IRQS:4352 nr_irqs:256
      [    0.000000] ------------[ cut here ]------------
      [    0.000000] WARNING: at mm/bootmem.c:537 alloc_arch_preferred_bootmem+0x40/0x7e()
      [    0.000000] Hardware name: To Be Filled By O.E.M.
      [    0.000000] Pid: 0, comm: swapper Not tainted 2.6.30-tip-02161-g7a74539-dirty #59709
      [    0.000000] Call Trace:
      [    0.000000]  [<ffffffff823f8c8e>] ? alloc_arch_preferred_bootmem+0x40/0x7e
      [    0.000000]  [<ffffffff81067168>] warn_slowpath_common+0x88/0xcb
      [    0.000000]  [<ffffffff810671d2>] warn_slowpath_null+0x27/0x3d
      [    0.000000]  [<ffffffff823f8c8e>] alloc_arch_preferred_bootmem+0x40/0x7e
      [    0.000000]  [<ffffffff823f9307>] ___alloc_bootmem_nopanic+0x4e/0xec
      [    0.000000]  [<ffffffff823f93c5>] ___alloc_bootmem+0x20/0x61
      [    0.000000]  [<ffffffff823f962e>] __alloc_bootmem+0x1e/0x34
      [    0.000000]  [<ffffffff823f757c>] early_irq_init+0x6d/0x118
      [    0.000000]  [<ffffffff823e0140>] ? early_idt_handler+0x0/0x71
      [    0.000000]  [<ffffffff823e0cf7>] start_kernel+0x192/0x394
      [    0.000000]  [<ffffffff823e0140>] ? early_idt_handler+0x0/0x71
      [    0.000000]  [<ffffffff823e02ad>] x86_64_start_reservations+0xb4/0xcf
      [    0.000000]  [<ffffffff823e0000>] ? __init_begin+0x0/0x140
      [    0.000000]  [<ffffffff823e0420>] x86_64_start_kernel+0x158/0x17b
      [    0.000000] ---[ end trace a7919e7f17c0a725 ]---
      [    0.000000] Fast TSC calibration using PIT
      [    0.000000] Detected 2002.510 MHz processor.
      [    0.004000] Console: colour VGA+ 80x25
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
    • sched: use slab in cpupri_init() · 0fb53029
      Pekka Enberg committed
      Let's not use the bootmem allocator in cpupri_init(), as slab is already up
      when it is run.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
    • sched: use alloc_cpumask_var() instead of alloc_bootmem_cpumask_var() · 4bdddf8f
      Pekka Enberg committed
      Slab is initialized when sched_init() runs now, so let's use alloc_cpumask_var().
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
    • irq/cpumask: make memoryless node zero happy · dad213ae
      Yinghai Lu committed
      Don't hardcode to node zero for early boot IRQ setup memory allocations.
      
      [ penberg@cs.helsinki.fi: minor cleanups ]
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
    • x86: remove some alloc_bootmem_cpumask_var calling · 38c7fed2
      Yinghai Lu committed
      Now that we set up the slab allocator earlier, we can get rid of some
      alloc_bootmem_cpumask_var() calls in boot code.
      
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Yinghai Lu <yinghai@kernel.org>
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
    • sched: use kzalloc() instead of the bootmem allocator · 36b7b6d4
      Pekka Enberg committed
      Now that kmem_cache_init() happens before sched_init(), we should use kzalloc()
      and not the bootmem allocator.
      Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
    • kmemleak: Add modules support · 4f2294b6
      Catalin Marinas committed
      This patch handles the kmemleak operations needed for module loading so
      that memory allocations from inside a module are properly tracked.
      Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
  5. 11 Jun, 2009 2 commits