1. 06 4月, 2009 8 次提交
    • P
      perf_counter: new output ABI - part 1 · 7b732a75
      Peter Zijlstra 提交于
      Impact: Rework the perfcounter output ABI
      
      use sys_read() only for instant data and provide mmap() output for all
      async overflow data.
      
      The first mmap() determines the size of the output buffer. The mmap()
      size must be a PAGE_SIZE multiple of 1+pages, where pages must be a
      power of 2 or 0. Further mmap()s of the same fd must have the same
      size. Once all maps are gone, you can again mmap() with a new size.
      
      In case of 0 extra pages there is no data output and the first page
      only contains meta data.
      
      When there are data pages, a poll() event will be generated for each
      full page of data. Furthermore, the output is circular. This means
      that although 1 page is a valid configuration, its useless, since
      we'll start overwriting it the instant we report a full page.
      
      Future work will focus on the output format (currently maintained)
      where we'll likey want each entry denoted by a header which includes a
      type and length.
      
      Further future work will allow to splice() the fd, also containing the
      async overflow data -- splice() would be mutually exclusive with
      mmap() of the data.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Orig-LKML-Reference: <20090323172417.470536358@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      7b732a75
    • P
      perf_counter: add an mmap method to allow userspace to read hardware counters · 37d81828
      Paul Mackerras 提交于
      Impact: new feature giving performance improvement
      
      This adds the ability for userspace to do an mmap on a hardware counter
      fd and get access to a read-only page that contains the information
      needed to translate a hardware counter value to the full 64-bit
      counter value that would be returned by a read on the fd.  This is
      useful on architectures that allow user programs to read the hardware
      counters, such as PowerPC.
      
      The mmap will only succeed if the counter is a hardware counter
      monitoring the current process.
      
      On my quad 2.5GHz PowerPC 970MP machine, userspace can read a counter
      and translate it to the full 64-bit value in about 30ns using the
      mmapped page, compared to about 830ns for the read syscall on the
      counter, so this does give a significant performance improvement.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Orig-LKML-Reference: <20090323172417.297057964@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      37d81828
    • P
      perf_counter: remove the event config bitfields · f4a2deb4
      Peter Zijlstra 提交于
      Since the bitfields turned into a bit of a mess, remove them and rely on
      good old masks.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Orig-LKML-Reference: <20090323172417.059499915@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      f4a2deb4
    • P
      perf_counter: fix type/event_id layout on big-endian systems · 9aaa131a
      Paul Mackerras 提交于
      Impact: build fix for powerpc
      
      Commit db3a944aca35ae61 ("perf_counter: revamp syscall input ABI")
      expanded the hw_event.type field into a union of structs containing
      bitfields.  In particular it introduced a type field and a raw_type
      field, with the intention that the 1-bit raw_type field should
      overlay the most-significant bit of the 8-bit type field, and in fact
      perf_counter_alloc() now assumes that (or at least, assumes that
      raw_type doesn't overlay any of the bits that are 1 in the values of
      PERF_TYPE_{HARDWARE,SOFTWARE,TRACEPOINT}).
      
      Unfortunately this is not true on big-endian systems such as PowerPC,
      where bitfields are laid out from left to right, i.e. from most
      significant bit to least significant.  This means that setting
      hw_event.type = PERF_TYPE_SOFTWARE will set hw_event.raw_type to 1.
      
      This fixes it by making the layout depend on whether or not
      __BIG_ENDIAN_BITFIELD is defined.  It's a bit ugly, but that's what
      we get for using bitfields in a user/kernel ABI.
      
      Also, that commit didn't fix up some places in arch/powerpc/kernel/
      perf_counter.c where hw_event.raw and hw_event.event_id were used.
      This fixes them too.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      9aaa131a
    • P
      perf_counter: powerpc: clean up perc_counter_interrupt · db4fb5ac
      Paul Mackerras 提交于
      Impact: cleanup
      
      This updates the powerpc perf_counter_interrupt following on from the
      "perf_counter: unify irq output code" patch.  Since we now use the
      generic perf_counter_output code, which sets the perf_counter_pending
      flag directly, we no longer need the need_wakeup variable.
      
      This removes need_wakeup and makes perf_counter_interrupt use
      get_perf_counter_pending() instead.
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Orig-LKML-Reference: <20090319194234.024464535@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      db4fb5ac
    • P
      perf_counter: unify irq output code · 0322cd6e
      Peter Zijlstra 提交于
      Impact: cleanup
      
      Having 3 slightly different copies of the same code around does nobody
      any good. First step in revamping the output format.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Orig-LKML-Reference: <20090319194233.929962222@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      0322cd6e
    • P
      perf_counter: revamp syscall input ABI · b8e83514
      Peter Zijlstra 提交于
      Impact: modify ABI
      
      The hardware/software classification in hw_event->type became a little
      strained due to the addition of tracepoint tracing.
      
      Instead split up the field and provide a type field to explicitly specify
      the counter type, while using the event_id field to specify which event to
      use.
      
      Raw counters still work as before, only the raw config now goes into
      raw_event.
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Orig-LKML-Reference: <20090319194233.836807573@chello.nl>
      Signed-off-by: NIngo Molnar <mingo@elte.hu>
      b8e83514
    • P
      perf_counter: abstract wakeup flag setting in core to fix powerpc build · b6c5a71d
      Paul Mackerras 提交于
      Impact: build fix for powerpc
      
      Commit bd753921015e7905 ("perf_counter: software counter event
      infrastructure") introduced a use of TIF_PERF_COUNTERS into the core
      perfcounter code.  This breaks the build on powerpc because we use
      a flag in a per-cpu area to signal wakeups on powerpc rather than
      a thread_info flag, because the thread_info flags have to be
      manipulated with atomic operations and are thus slower than per-cpu
      flags.
      
      This fixes the by changing the core to use an abstracted
      set_perf_counter_pending() function, which is defined on x86 to set
      the TIF_PERF_COUNTERS flag and on powerpc to set the per-cpu flag
      (paca->perf_counter_pending).  It changes the previous powerpc
      definition of set_perf_counter_pending to not take an argument and
      adds a clear_perf_counter_pending, so as to simplify the definition
      on x86.
      
      On x86, set_perf_counter_pending() is defined as a macro.  Defining
      it as a static inline in arch/x86/include/asm/perf_counters.h causes
      compile failures because <asm/perf_counters.h> gets included early in
      <linux/sched.h>, and the definitions of set_tsk_thread_flag etc. are
      therefore not available in <asm/perf_counters.h>.  (On powerpc this
      problem is avoided by defining set_perf_counter_pending etc. in
      <asm/hw_irq.h>.)
      Signed-off-by: NPaul Mackerras <paulus@samba.org>
      b6c5a71d
  2. 03 4月, 2009 2 次提交
  3. 02 4月, 2009 2 次提交
  4. 31 3月, 2009 1 次提交
    • A
      proc 2/2: remove struct proc_dir_entry::owner · 99b76233
      Alexey Dobriyan 提交于
      Setting ->owner as done currently (pde->owner = THIS_MODULE) is racy
      as correctly noted at bug #12454. Someone can lookup entry with NULL
      ->owner, thus not pinning enything, and release it later resulting
      in module refcount underflow.
      
      We can keep ->owner and supply it at registration time like ->proc_fops
      and ->data.
      
      But this leaves ->owner as easy-manipulative field (just one C assignment)
      and somebody will forget to unpin previous/pin current module when
      switching ->owner. ->proc_fops is declared as "const" which should give
      some thoughts.
      
      ->read_proc/->write_proc were just fixed to not require ->owner for
      protection.
      
      rmmod'ed directories will be empty and return "." and ".." -- no harm.
      And directories with tricky enough readdir and lookup shouldn't be modular.
      We definitely don't want such modular code.
      
      Removing ->owner will also make PDE smaller.
      
      So, let's nuke it.
      
      Kudos to Jeff Layton for reminding about this, let's say, oversight.
      
      http://bugzilla.kernel.org/show_bug.cgi?id=12454Signed-off-by: NAlexey Dobriyan <adobriyan@gmail.com>
      99b76233
  5. 27 3月, 2009 2 次提交
    • B
      powerpc: Fix bugs introduced by sysfs changes · ec78c8ac
      Benjamin Herrenschmidt 提交于
      Rusty's patch to change our sysfs access to various registers
      to use smp_call_function_single() introduced a whole bunch of
      warnings. This fixes them. This version also fixes an actual
      bug in here where it did mtspr instead of mfspr when reading
      the files
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      ec78c8ac
    • J
      powerpc: Sanitize stack pointer in signal handling code · efbda860
      Josh Boyer 提交于
      On powerpc64 machines running 32-bit userspace, we can get garbage bits in the
      stack pointer passed into the kernel.  Most places handle this correctly, but
      the signal handling code uses the passed value directly for allocating signal
      stack frames.
      
      This fixes the issue by introducing a get_clean_sp function that returns a
      sanitized stack pointer.  For 32-bit tasks on a 64-bit kernel, the stack
      pointer is masked correctly.  In all other cases, the stack pointer is simply
      returned.
      
      Additionally, we pass an 'is_32' parameter to get_sigframe now in order to
      get the properly sanitized stack.  The callers are know to be 32 or 64-bit
      statically.
      Signed-off-by: NJosh Boyer <jwboyer@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      efbda860
  6. 25 3月, 2009 1 次提交
  7. 24 3月, 2009 9 次提交
  8. 23 3月, 2009 1 次提交
  9. 21 3月, 2009 1 次提交
  10. 17 3月, 2009 1 次提交
  11. 11 3月, 2009 9 次提交
  12. 10 3月, 2009 1 次提交
    • T
      linker script: define __per_cpu_load on all SMP capable archs · 19390c4d
      Tejun Heo 提交于
      Impact: __per_cpu_load available on all SMP capable archs
      
      Percpu now requires three symbols to be defined - __per_cpu_load,
      __per_cpu_start and __per_cpu_end.  There were three archs which
      didn't have it.  Update them as follows.
      
      * powerpc: can use generic PERCPU() macro.  Compile tested for
        powerpc32, compile/boot tested for powerpc64.
      
      * ia64: can use generic PERCPU_VADDR() macro.  __phys_per_cpu_start is
        identical to __per_cpu_load.  Compile tested and symbol table looks
        identical after the change except for the additional __per_cpu_load.
      
      * arm: added explicit __per_cpu_load definition.  Currently uses
        unified .init output section so can't use the generic macro.  Dunno
        whether the unified .init ouput section is required by arch
        peculiarity so I left it alone.  Please break it up and use PERCPU()
        if possible.
      Signed-off-by: NTejun Heo <tj@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Pat Gefre <pfg@sgi.com>
      Cc: Russell King <rmk@arm.linux.org.uk>
      19390c4d
  13. 09 3月, 2009 1 次提交
  14. 06 3月, 2009 1 次提交