1. 03 June 2010 · 1 commit
    • perf: Fix crash in swevents · c6df8d5a
      By Peter Zijlstra
      Frederic reported that because swevents handling doesn't disable IRQs
      anymore, we can get a recursion of perf_adjust_period(), once from
      overflow handling and once from the tick.
      
      If both call ->disable, we get a double hlist_del_rcu() and trigger
      a LIST_POISON2 dereference.
      
      Since we don't actually need to stop/start a swevent to re-program
      the hardware (there is no hardware to program), simply nop out these
      callbacks for the swevent pmu.
      Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1275557609.27810.35218.camel@twins>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
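      The shape of the fix, sketched with hypothetical names (the real
      change is commit c6df8d5a; this is an illustration, not the literal
      diff): for a software event the stop/start bracket around period
      re-programming used to do real, non-reentrant work
      (hlist_del_rcu()/hlist_add_head_rcu()), so making that bracket a
      no-op removes the double delete.

      	/* Hedged illustration only. */
      	static void sw_stop_noop(struct perf_event *event)
      	{
      		/* nothing to stop: no hardware counter backs a swevent */
      	}

      	static void sw_start_noop(struct perf_event *event)
      	{
      		/* nothing to re-program either */
      	}

      	/* Callers bracket period re-programming like this; with no-op
      	 * callbacks a nested perf_adjust_period() is now harmless. */
      	static void adjust_period_sketch(struct perf_event *event, u64 period)
      	{
      		sw_stop_noop(event);
      		event->hw.sample_period = period;
      		sw_start_noop(event);
      	}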
  2. 31 May 2010 · 4 commits
    • perf_events: Fix unincremented buffer base on partial copy · 74048f89
      By Frederic Weisbecker
      If a sample size crosses to the next page boundary, the copy
      will be made in more than one step. However, we forget to advance
      the source offset for the next copy, leading to unexpected double
      copies that completely mess up the traces.
      
      This fixes various kinds of bad traces that contain irrelevant
      data; for example:
      
      	geany-4979  [001]  5758.077775: sched_switch: prev_comm=! prev_pid=121
      		prev_prio=0 prev_state=S|D|Z|X|x ==> next_comm= next_pid=7497072
      		next_prio=0
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1274988898-5639-1-git-send-regression-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
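      The bug pattern is easy to see in miniature. A hedged sketch of a
      page-crossing copy loop (standalone C with illustrative names, not
      the kernel's ring-buffer code): every chunk must advance the source
      pointer as well as the destination, and the missing advance is what
      doubled the data.

      	#include <string.h>

      	/* Copy 'len' bytes from 'src' into fixed-size pages. */
      	static void chunked_copy(char **dst_pages, size_t page_size,
      				 const char *src, size_t len)
      	{
      		size_t page = 0, off = 0;

      		while (len) {
      			size_t chunk = page_size - off;

      			if (chunk > len)
      				chunk = len;
      			memcpy(dst_pages[page] + off, src, chunk);
      			src += chunk;	/* the advance the buggy code forgot */
      			len -= chunk;
      			off += chunk;
      			if (off == page_size) {
      				page++;
      				off = 0;
      			}
      		}
      	}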
    • perf_events: Fix event scheduling issues introduced by transactional API · 90151c35
      By Stephane Eranian
      The transactional API patch between the generic and model-specific
      code introduced several important bugs with event scheduling, at
      least on X86. If you had pinned events, e.g., the watchdog, and were
      over-committing the PMU, you would get bogus counts. The bug was
      showing up on Intel CPUs because events move around more
      often than on AMD. But the problem also existed on AMD, though
      it was harder to expose.
      
      The issues were:
      
       - group_sched_in() was missing a cancel_txn() in the error path
      
       - cpuc->n_added was not properly maintained, leading to missing
         actions in hw_perf_enable(), i.e., n_running being 0. You cannot
         update n_added until you know the transaction has succeeded. In
         case of failed transaction n_added was not adjusted back.
      
       - in case of failed transactions, event_sched_out() was called
         and eventually invoked x86_disable_event() to touch the HW reg.
         But with transactions, on X86, event_sched_in() does not touch
         HW registers, it simply collects events into a list. Thus, you
         could end up calling x86_disable_event() on a counter which
         did not correspond to the current event when idx != -1.
      
      The patch modifies the generic and X86 code to avoid all those problems.
      
      First, we keep track of the number of events added last. In case the
      transaction fails, we subtract them from n_added. This approach is
      necessary (as opposed to delaying updates to n_added) because not all
      event updates use the transaction API, e.g., single events.
      
      Second, we encapsulate the event_sched_in() and event_sched_out() in
      group_sched_in() inside the transaction. That makes the operations
      symmetrical and you can also detect that you are inside a transaction
      and skip the HW reg access by checking cpuc->group_flag.
      
      With this patch, you can now overcommit the PMU even with pinned
      system-wide events present and still get valid counts.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1274796225.5882.1389.camel@twins>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
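      The resulting flow is easiest to see as code. A simplified sketch of
      group_sched_in() after the fix (condensed from kernel/perf_event.c
      of that era; locking and accounting details omitted):

      	static int group_sched_in(struct perf_event *group_event,
      				  struct perf_cpu_context *cpuctx,
      				  struct perf_event_context *ctx)
      	{
      		struct perf_event *event, *partial_group = NULL;
      		const struct pmu *pmu = group_event->pmu;

      		if (group_event->state == PERF_EVENT_STATE_OFF)
      			return 0;

      		pmu->start_txn(pmu);

      		if (event_sched_in(group_event, cpuctx, ctx)) {
      			pmu->cancel_txn(pmu);	/* previously missing */
      			return -EAGAIN;
      		}

      		list_for_each_entry(event, &group_event->sibling_list, group_entry) {
      			if (event_sched_in(event, cpuctx, ctx)) {
      				partial_group = event;
      				goto group_error;
      			}
      		}

      		if (!pmu->commit_txn(pmu))
      			return 0;	/* only now is n_added updated on x86 */

      	group_error:
      		/* roll back the members that did make it in */
      		list_for_each_entry(event, &group_event->sibling_list, group_entry) {
      			if (event == partial_group)
      				break;
      			event_sched_out(event, cpuctx, ctx);
      		}
      		event_sched_out(group_event, cpuctx, ctx);

      		pmu->cancel_txn(pmu);	/* undo the transaction state */
      		return -EAGAIN;
      	}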
    • perf_events: Fix races in group composition · 8a49542c
      By Peter Zijlstra
      Group siblings don't pin each other or the parent, so when we destroy
      events we must make sure to clean up all cross referencing pointers.
      
      In particular, for destruction of a group leader we must be able to
      find all its siblings and remove their reference to it.
      
      This means that detaching an event from its context must not detach it
      from the group, otherwise we can end up failing to clear all pointers.
      
      Solve this by clearly separating the attachment to a context and
      attachment to a group, and keep the group composed until we destroy
      the events.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
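      A sketch of the separation (the flag names are believed to match
      those the patch introduced; the function bodies are heavily
      simplified):

      	#define PERF_ATTACH_CONTEXT	0x01
      	#define PERF_ATTACH_GROUP	0x02

      	/* Detach from the context only; group linkage stays intact. */
      	static void list_del_event(struct perf_event *event,
      				   struct perf_event_context *ctx)
      	{
      		if (!(event->attach_state & PERF_ATTACH_CONTEXT))
      			return;
      		event->attach_state &= ~PERF_ATTACH_CONTEXT;
      		list_del_rcu(&event->event_entry);
      	}

      	/* Dissolve group cross-references only at destruction time. */
      	static void perf_group_detach(struct perf_event *event)
      	{
      		if (!(event->attach_state & PERF_ATTACH_GROUP))
      			return;
      		event->attach_state &= ~PERF_ATTACH_GROUP;
      		/* a dying leader can still find and unlink its siblings */
      	}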
    • perf_events: Fix races and clean up perf_event and perf_mmap_data interaction · ac9721f3
      By Peter Zijlstra
      In order to move toward separate buffer objects, rework the whole
      perf_mmap_data construct to be a more self-sufficient entity, one
      with its own lifetime rules.
      
      This greatly sanitizes the whole output redirection code, which
      was riddled with bugs and races.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: <stable@kernel.org>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  3. 28 May 2010 · 1 commit
  4. 21 May 2010 · 8 commits
    • perf: Optimize perf_tp_event_match() · 580d607c
      By Peter Zijlstra
      Since we know tracepoints come from kernel context,
      avoid conditionals that try to establish that very
      fact.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <20100521090710.904944001@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Remove more code from the fastpath · a94ffaaf
      By Peter Zijlstra
      Sanity checks cost instructions.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <20100521090710.852926930@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Optimize the !vmalloc backed buffer · 3cafa9fb
      By Peter Zijlstra
      Reduce code and data by using the knowledge that for
      !PERF_USE_VMALLOC data_order is always 0.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <20100521090710.795019386@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Optimize perf_output_copy() · 5d967a8b
      By Peter Zijlstra
      Reduce the clutter in perf_output_copy() by keeping
      an iterator in perf_output_handle.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <20100521090710.742809176@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Fix wakeup storm for RO mmap()s · adb8e118
      By Peter Zijlstra
      RO mmap()s don't update the tail pointer, so
      comparing against it to determine the written data
      size doesn't do any good.
      
      Keep track of when we last did a wakeup, and compare
      against that.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <20100521090710.684479310@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Ensure that IOC_OUTPUT isn't used to create multi-writer buffers · 0f139300
      By Peter Zijlstra
      Since we want to ensure buffers only have a single
      writer, we must avoid creating one with multiple.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <20100521090710.528215873@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf, trace: Optimize tracepoints by using per-tracepoint-per-cpu hlist to track events · 1c024eca
      By Peter Zijlstra
      Avoid the swevent hash-table by using per-tracepoint
      hlists.
      
      Also, avoid conditionals on the fast path by ordering
      against probe unregister so that we can never be on
      the callback path without the data being there.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      LKML-Reference: <20100521090710.473188012@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
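      An illustrative sketch of the idea (names are approximations, not
      the exact kernel API of the time): each tracepoint owns a per-cpu
      hlist of active perf events, so the probe walks only its own list
      instead of hashing into the global swevent table.

      	/* One list head per cpu, hanging off the tracepoint. */
      	struct hlist_head __percpu *tp_perf_events;

      	static void tp_perf_probe(void *record, int size, struct pt_regs *regs)
      	{
      		struct hlist_head *head = this_cpu_ptr(tp_perf_events);
      		struct perf_event *event;
      		struct hlist_node *node;

      		/* no type/event_id matching needed: only events for this
      		 * tracepoint are ever on this list */
      		hlist_for_each_entry_rcu(event, node, head, hlist_entry)
      			perf_swevent_sample(event, record, size, regs); /* hypothetical helper */
      	}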
    • perf: Fix forgotten preempt_enable by nested writers · acd35a46
      By Frederic Weisbecker
      A writer that gets a reference to the buffer handle disables
      preemption. When we put that reference, we check if we are
      the outermost writer and, if not, we simply return and defer
      the head update to the outermost writer. The problem here
      is that preemption is only re-enabled by the outermost writer,
      which produces a preemption count imbalance for every nested
      writer that exits.
      
      So just don't forget to always re-enable preemption when we
      put the buffer reference, whoever we are.
      
      Fixes lots of sleeping-in-atomic warnings, visible when
      recording lock events.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Stephane Eranian <eranian@google.com>
      Cc: Robert Richter <robert.richter@amd.com>
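      The shape of the fix, as a simplified sketch (the 'nest' counter
      tracks writer nesting, as in the mainline code of the time):

      	static void perf_output_put_handle(struct perf_output_handle *handle)
      	{
      		struct perf_mmap_data *data = handle->data;

      		if (!local_dec_and_test(&data->nest))
      			goto out;	/* nested: head update is the outer writer's job */

      		/* outermost writer: publish data->head, do wakeups, ... */

      	out:
      		preempt_enable();	/* every writer disabled preemption on get,
      					 * so every writer must re-enable it on put */
      	}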
  5. 20 May 2010 · 1 commit
    • perf: Comply with new rcu checks API · 49f135ed
      By Frederic Weisbecker
      The software events hlist doesn't fully comply with the new
      RCU checks API.
      
      We need to consider three different sides that access the hlist:
      
      - the hlist allocation/release side. This side happens when an
        event is created or released; accesses to the hlist are
        serialized under the cpuctx mutex.
      
      - the events insertion/removal in the hlist. This side is always
        serialized against the above one. The hlist is always present
        during such operations. This side happens when a software event
        is scheduled in/out. The serialization that ensures the software
        event is really attached to the context is made under the
        ctx->lock.
      
      - events triggering. This is the read side, it can happen
        concurrently with any update side.
      
      This patch deals with them one by one and anticipates the
      separate RCU memory-space patches in preparation.
      
      This patch fixes various annoying rcu warnings.
      Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
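      For the two update sides this mostly means telling the RCU debug
      checks which lock serializes the dereference. A hedged sketch of the
      pattern (field and lock names are illustrative):

      	#include <linux/rcupdate.h>

      	static struct swevent_hlist *
      	hlist_for_reader(struct perf_cpu_context *cpuctx)
      	{
      		/* read side (event triggering): plain RCU is enough */
      		return rcu_dereference(cpuctx->swevent_hlist);
      	}

      	static struct swevent_hlist *
      	hlist_for_updater(struct perf_cpu_context *cpuctx)
      	{
      		/* update sides: name the serializing lock so lockdep can
      		 * verify it instead of emitting a false warning */
      		return rcu_dereference_check(cpuctx->swevent_hlist,
      				lockdep_is_held(&cpuctx->hlist_mutex));
      	}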
  6. 19 May 2010 · 7 commits
  7. 11 May 2010 · 3 commits
  8. 08 May 2010 · 1 commit
    • perf_event: Make software events work again · 6e85158c
      By Paul Mackerras
      Commit 6bde9b6c ("perf: Add
      group scheduling transactional APIs") added code to allow a
      group to be scheduled in a single transaction.  However, it
      introduced a bug in handling events whose pmu does not implement
      transactions -- at the end of scheduling in the events in the
      group, in the non-transactional case the code now falls through
      to the group_error label, and proceeds to unschedule all the
      events in the group and return failure.
      
      This fixes it by returning 0 (success) in the non-transactional
      case.
      Signed-off-by: Paul Mackerras <paulus@samba.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Lin Ming <ming.m.lin@intel.com>
      Cc: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: eranian@gmail.com
      LKML-Reference: <20100508105800.GB10650@brick.ozlabs.ibm.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  9. 07 May 2010 · 3 commits
    • perf: Add group scheduling transactional APIs · 6bde9b6c
      By Lin Ming
      Add group scheduling transactional APIs to struct pmu.
      These APIs will be implemented in arch code, based on Peter's idea as
      below.
      
      > the idea behind hw_perf_group_sched_in() is to not perform
      > schedulability tests on each event in the group, but to add the group
      > as a whole and then perform one test.
      >
      > Of course, when that test fails, you'll have to roll-back the whole
      > group again.
      >
      > So start_txn (or a better name) would simply toggle a flag in the pmu
      > implementation that will make pmu::enable() not perform the
      > schedulability test.
      >
      > Then commit_txn() will perform the schedulability test (so note the
      > method has to have a !void return value).
      >
      > This will allow us to use the regular
      > kernel/perf_event.c::group_sched_in() and all the rollback code.
      > Currently each hw_perf_group_sched_in() implementation duplicates all
      > the rollback code (with various bugs).
      
      ->start_txn:
      Start group events scheduling transaction; set a flag so that
      pmu::enable() does not perform the schedulability test, which will
      instead be performed at commit time.
      
      ->commit_txn:
      Commit group events scheduling transaction; perform the
      schedulability test for the group as a whole.
      
      ->cancel_txn:
      Stop group events scheduling transaction; clear the flag so
      pmu::enable() will again perform the schedulability test.
      Reviewed-by: Stephane Eranian <eranian@google.com>
      Reviewed-by: Frederic Weisbecker <fweisbec@gmail.com>
      Signed-off-by: Lin Ming <ming.m.lin@intel.com>
      Cc: David Miller <davem@davemloft.net>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1272002160.5707.60.camel@minggr.sh.intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
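      As added to struct pmu (a sketch of the 2010 layout, where the
      callbacks took the pmu itself; surrounding members elided):

      	struct pmu {
      		int  (*enable)		(struct perf_event *event);
      		void (*disable)		(struct perf_event *event);
      		/* ... */
      		void (*start_txn)	(const struct pmu *pmu);  /* defer tests */
      		int  (*commit_txn)	(const struct pmu *pmu);  /* test group  */
      		void (*cancel_txn)	(const struct pmu *pmu);  /* roll back   */
      	};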
    • perf: Annotate perf_event_read_group() vs perf_event_release_kernel() · a0507c84
      By Peter Zijlstra
      Stephane reported a lockdep warning while using PERF_FORMAT_GROUP.
      
      The issue is that perf_event_read_group() takes faults while holding
      the ctx->mutex, while perf_event_release_kernel() can be called from
      munmap(), which makes for an AB-BA deadlock.
      
      Except we can never actually establish the deadlock, because we only
      ever call perf_event_release_kernel() after all file descriptors are
      dead, so there is no concurrency possible.
      Reported-by: Stephane Eranian <eranian@google.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <new-submission>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Fix exit() vs PERF_FORMAT_GROUP · 4fd38e45
      By Peter Zijlstra
      Both Stephane and Corey reported that PERF_FORMAT_GROUP didn't work
      as expected if the task the counters were attached to quit before
      the read() call.
      
      The cause is that we unconditionally destroy the grouping when we
      remove counters from their context. Fix this by only doing so when
      we free the counter itself.
      Reported-by: Corey Ashford <cjashfor@linux.vnet.ibm.com>
      Reported-by: Stephane Eranian <eranian@google.com>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      LKML-Reference: <1273160566.5605.404.camel@twins>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  10. 01 May 2010 · 1 commit
  11. 19 April 2010 · 1 commit
  12. 15 April 2010 · 3 commits
    • perf: Fix hlist related build error · 95476b64
      By Frederic Weisbecker
      hlist helpers need to be available for all software events, not
      only trace events.
      
      Pull them out of the ifdef CONFIG_EVENT_TRACING section.
      
      Fixes:
      	kernel/perf_event.c:4573: error: implicit declaration of function 'swevent_hlist_put'
      	kernel/perf_event.c:4614: error: implicit declaration of function 'swevent_hlist_get'
      	kernel/perf_event.c:5534: error: implicit declaration of function 'swevent_hlist_release'
      Reported-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1271281338-23491-1-git-send-regression-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • perf: Make clock software events consistent with general exclusion rules · df8290bf
      By Frederic Weisbecker
      The cpu/task clock events implement their own version of exclusion
      on top of exclude_user and exclude_kernel.
      
      The result is that when the event triggers in the kernel while
      exclude_kernel is set, we try to rewind using task_pt_regs.
      There are two side effects of this:
      
      - we call task_pt_regs even on kernel threads, which doesn't give
        us the desired result.
      - if the event occurred in the kernel, we shouldn't rewind to the
        user context; we want to actually ignore the event.
      
      get_irq_regs() will always give us the right interrupted context, so
      use its result and submit it to perf_exclude_context(), which knows
      when an event must be ignored.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
    • perf: Store active software events in a hashlist · 76e1d904
      By Frederic Weisbecker
      Each time a software event triggers, we need to walk through
      the entire list of events from the current cpu and task contexts
      to retrieve a running perf event that matches.
      We also need to check that a matching perf event is actually counting.
      
      This walk is wasteful and makes the event fast path scale
      poorly as the number of events running on the same contexts
      grows.
      
      To solve this, we store the running perf events in a hashlist so
      we get immediate access to them, keyed by type:event_id, when
      they trigger.
      
      v2: - Fix SWEVENT_HLIST_SIZE definition (and re-learn some basic
            maths along the way)
          - Only allocate hlist for online cpus, but keep track of the
            refcount on offline possible cpus too, so that we allocate it
            if needed when it becomes online.
          - Drop the kref use as it's not adapted to our tricks anymore.
      
      v3: - Fix bad refcount check (address instead of value). Thanks to
            Eric Dumazet who spotted this.
          - While exiting cpu, move the hlist release out of the IPI path
            to lock the hlist mutex sanely.
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
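      The lookup is close to the following sketch (believed to mirror the
      code the patch adds; rcu plumbing omitted): events are hashed by
      (type, event_id), so a triggering software event finds its candidate
      counters in one step instead of walking every event on the context.

      	#include <linux/hash.h>

      	#define SWEVENT_HLIST_BITS	8
      	#define SWEVENT_HLIST_SIZE	(1 << SWEVENT_HLIST_BITS)

      	struct swevent_hlist {
      		struct hlist_head	heads[SWEVENT_HLIST_SIZE];
      	};

      	static u64 swevent_hash(u64 type, u32 event_id)
      	{
      		return hash_64((type << 32) | event_id, SWEVENT_HLIST_BITS);
      	}

      	static struct hlist_head *
      	find_swevent_head(struct swevent_hlist *hlist, u64 type, u32 event_id)
      	{
      		return &hlist->heads[swevent_hash(type, event_id)];
      	}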
  13. 04 April 2010 · 1 commit
  14. 03 April 2010 · 2 commits
  15. 01 April 2010 · 1 commit
    • perf: Use hot regs with software sched switch/migrate events · e49a5bd3
      By Frederic Weisbecker
      Scheduler's task migration events don't work because they always
      pass NULL regs to perf_sw_event(). The event hence gets filtered
      out in perf_swevent_add().
      
      Scheduler's context switch events use task_pt_regs() to get
      the context where the event occurred, which is the wrong thing to
      do, as this won't give us the place in the kernel where we went
      to sleep but the place where we left userspace. The result is
      even more wrong if we switch from a kernel thread.
      
      Use the hot regs snapshot for both events, as they belong to the
      non-interrupt/exception based family of events. Unlike page faults
      and the like, which provide the regs matching the exact origin of
      the event, we need to save the current context.
      
      This makes the task migration event work and fixes the context
      switch callchains and origin ip.
      
      Example: perf record -a -e cs
      
      Before:
      
          10.91%      ksoftirqd/0                  0  [k] 0000000000000000
                      |
                      --- (nil)
                          perf_callchain
                          perf_prepare_sample
                          __perf_event_overflow
                          perf_swevent_overflow
                          perf_swevent_add
                          perf_swevent_ctx_event
                          do_perf_sw_event
                          __perf_sw_event
                          perf_event_task_sched_out
                          schedule
                          run_ksoftirqd
                          kthread
                          kernel_thread_helper
      
      After:
      
          23.77%  hald-addon-stor  [kernel.kallsyms]  [k] schedule
                  |
                  --- schedule
                     |
                     |--60.00%-- schedule_timeout
                     |          wait_for_common
                     |          wait_for_completion
                     |          blk_execute_rq
                     |          scsi_execute
                     |          scsi_execute_req
                     |          sr_test_unit_ready
                     |          |
                     |          |--66.67%-- sr_media_change
                     |          |          media_changed
                     |          |          cdrom_media_changed
                     |          |          sr_block_media_changed
                     |          |          check_disk_change
                     |          |          cdrom_open
      
      v2: Always build perf_arch_fetch_caller_regs() now that software
      events need that too. They don't need it from modules, unlike trace
      events, so we keep the EXPORT_SYMBOL in trace_event_perf.c
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: David Miller <davem@davemloft.net>
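      The call-site change is small. A hedged sketch (the wrapper name is
      hypothetical; perf_sw_event()'s signature of the time took an nmi
      flag and an addr argument, and perf_fetch_caller_regs() comes from
      the hot-regs series):

      	/* Emit a migration event with a regs snapshot taken right here,
      	 * instead of NULL or task_pt_regs(task). */
      	static inline void sw_migration_event(void)
      	{
      		struct pt_regs regs;

      		perf_fetch_caller_regs(&regs);	/* hot regs snapshot */
      		perf_sw_event(PERF_COUNT_SW_CPU_MIGRATIONS, 1, 1, &regs, 0);
      	}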
  16. 30 March 2010 · 1 commit
    • include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h · 5a0e3ad6
      By Tejun Heo
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include
      those headers directly instead of assuming availability.  As this
      conversion needs to touch a large number of source files, the
      following script is used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  I.e. if only gfp is used,
        gfp.h; if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surroundings.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition, while for others adding it to an
         implementation .h or embedding .c file was more appropriate.  This
         step added inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that they could be applied
         as a separate patch and serve as a bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers, which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
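      The per-file change the script makes is trivial; the scale is the
      point. A representative before/after for an illustrative .c file:

      	/* Before: kmalloc() only compiled because percpu.h (via sched.h
      	 * or module.h) dragged in slab.h indirectly. */
      	#include <linux/percpu.h>

      	/* After: the dependency is stated where it is used. */
      	#include <linux/percpu.h>
      	#include <linux/slab.h>		/* kmalloc(), kfree() */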
  17. 17 March 2010 · 1 commit
    • perf: Fix unexported generic perf_arch_fetch_caller_regs · dcd5c166
      By Frederic Weisbecker
      perf_arch_fetch_caller_regs() is exported for the overridden x86
      version, but not for the generic weak version.
      
      As a general rule, weak functions should not have their symbol
      exported in the same file in which they are defined.
      
      So let's export it in trace_event_perf.c, as it is used by trace
      events only.
      
      This fixes:
      
      	ERROR: ".perf_arch_fetch_caller_regs" [fs/xfs/xfs.ko] undefined!
      	ERROR: ".perf_arch_fetch_caller_regs" [arch/powerpc/platforms/cell/spufs/spufs.ko] undefined!
      
      -v2: And also only build it if trace events are enabled.
      -v3: Fix changelog mistake
      Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Xiao Guangrong <xiaoguangrong@cn.fujitsu.com>
      Cc: Paul Mackerras <paulus@samba.org>
      LKML-Reference: <1268697902-9518-1-git-send-regression-fweisbec@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
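      The resulting split, sketched (signature as of early 2010; the
      export macro shown is illustrative):

      	/* kernel/perf_event.c -- generic weak definition, no export here */
      	__weak void perf_arch_fetch_caller_regs(struct pt_regs *regs,
      						unsigned long ip, int skip)
      	{
      	}

      	/* kernel/trace/trace_event_perf.c -- the export lives with the
      	 * user, and resolves to the arch override when one exists */
      	EXPORT_SYMBOL_GPL(perf_arch_fetch_caller_regs);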