1. 07 12月, 2011 2 次提交
    • A
      oprofile, s390: Add event interface to the System z hardware sampling module · dd3c4670
      Andreas Krebbel 提交于
      With this patch the OProfile Basic Mode Sampling support for System z
      is enhanced with a counter file system.  That way hardware sampling
      can be configured using the user space tools with only little
      modifications.
      
      With the patch by default new cpu_types (s390/z10, s390/z196) are
      returned in order to indicate that we are running a CPU which provides
      the hardware sampling facility.  Existing user space tools will
      complain about an unknown cpu type. In order to be compatible with
      existing user space tools the `cpu_type' module parameter has been
      added.  Setting the parameter to `timer' will force the module to
      return `timer' as cpu_type.  The module will still try to use hardware
      sampling if available and the hwsampling virtual filesystem will be
      also be available for configuration.  So this has a different effect
      than using the generic oprofile module parameter `timer=1'.
      
      If the basic mode sampling is enabled on the machine and the
      cpu_type=timer parameter is not used the kernel module will provide
      the following virtual filesystem:
      
      /dev/oprofile/0/enabled
      /dev/oprofile/0/event
      /dev/oprofile/0/count
      /dev/oprofile/0/unit_mask
      /dev/oprofile/0/kernel
      /dev/oprofile/0/user
      
      In the counter file system only the values of 'enabled', 'count',
      'kernel', and 'user' are evaluated by the kernel module. Everything
      else must contain fixed values.
      
      The 'event' value only supports a single event - HWSAMPLING with value
      0.
      
      The 'count' value specifies the hardware sampling rate as it is passed
      to the CPU measurement facility.
      
      The 'kernel' and 'user' flags can now be used to filter for samples
      when using hardware sampling.
      
      Additionally also the following file will be created:
      /dev/oprofile/timer/enabled
      
      This will always be the inverted value of /dev/oprofile/0/enabled. 0
      is not accepted without hardware sampling.
      Signed-off-by: NAndreas Krebbel <krebbel@linux.vnet.ibm.com>
      Signed-off-by: NRobert Richter <robert.richter@amd.com>
      dd3c4670
    • R
      oprofile: Fix oprofile_timer_exit() breakage · f8c85203
      Robert Richter 提交于
      Removing remainings of oprofile_timer_exit() completly.
      Signed-off-by: NRobert Richter <robert.richter@amd.com>
      f8c85203
  2. 08 11月, 2011 1 次提交
  3. 04 11月, 2011 5 次提交
    • R
      oprofile, x86: Reimplement nmi timer mode using perf event · dcfce4a0
      Robert Richter 提交于
      The legacy x86 nmi watchdog code was removed with the implementation
      of the perf based nmi watchdog. This broke Oprofile's nmi timer
      mode. To run nmi timer mode we relied on a continuous ticking nmi
      source which the nmi watchdog provided. The nmi tick was no longer
      available and current watchdog can not be used anymore since it runs
      with very long periods in the range of seconds. This patch
      reimplements the nmi timer mode using a perf counter nmi source.
      
      V2:
      * removing pr_info()
      * fix undefined reference to `__udivdi3' for 32 bit build
      * fix section mismatch of .cpuinit.data:nmi_timer_cpu_nb
      * removed nmi timer setup in arch/x86
      * implemented function stubs for op_nmi_init/exit()
      * made code more readable in oprofile_init()
      
      V3:
      * fix architectural initialization in oprofile_init()
      * fix CONFIG_OPROFILE_NMI_TIMER dependencies
      Acked-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Signed-off-by: NRobert Richter <robert.richter@amd.com>
      dcfce4a0
    • R
      oprofile: Remove exit function for timer mode · 75c43a20
      Robert Richter 提交于
      Remove exit functions by moving init/exit code to oprofile's setup/
      shutdown functions. Doing so the oprofile module exit code will be
      easier and less error-prone.
      Signed-off-by: NRobert Richter <robert.richter@amd.com>
      75c43a20
    • R
      oprofile, x86: Add kernel parameter oprofile.cpu_type=timer · 159a80b2
      Robert Richter 提交于
      We need this to better test x86 NMI timer mode. Otherwise it is very
      hard to setup systems with NMI timer enabled, but we have this e.g. in
      virtual machine environments.
      Signed-off-by: NRobert Richter <robert.richter@amd.com>
      159a80b2
    • R
      oprofile, x86: Fix crash when unloading module (nmi timer mode) · 97f7f818
      Robert Richter 提交于
      If oprofile uses the nmi timer interrupt there is a crash while
      unloading the module. The bug can be triggered with oprofile build as
      module and kernel parameter nolapic set. This patch fixes this.
      
      oprofile: using NMI timer interrupt.
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      IP: [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
      PGD 42dbca067 PUD 41da6a067 PMD 0
      Oops: 0002 [#1] PREEMPT SMP
      CPU 5
      Modules linked in: oprofile(-) [last unloaded: oprofile]
      
      Pid: 2518, comm: modprobe Not tainted 3.1.0-rc7-00019-gb2fb49d #19 Advanced Micro Device Anaheim/Anaheim
      RIP: 0010:[<ffffffff8123c226>]  [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
      RSP: 0018:ffff88041ef71e98  EFLAGS: 00010296
      RAX: 0000000000000000 RBX: ffffffffa0017100 RCX: dead000000200200
      RDX: 0000000000000000 RSI: dead000000100100 RDI: ffffffff8178c620
      RBP: ffff88041ef71ea8 R08: 0000000000000001 R09: 0000000000000082
      R10: 0000000000000000 R11: ffff88041ef71de8 R12: 0000000000000080
      R13: fffffffffffffff5 R14: 0000000000000001 R15: 0000000000610210
      FS:  00007fc902f20700(0000) GS:ffff88042fd40000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 0000000000000008 CR3: 000000041cdb6000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process modprobe (pid: 2518, threadinfo ffff88041ef70000, task ffff88041d348040)
      Stack:
       ffff88041ef71eb8 ffffffffa0017790 ffff88041ef71eb8 ffffffffa0013532
       ffff88041ef71ec8 ffffffffa00132d6 ffff88041ef71ed8 ffffffffa00159b2
       ffff88041ef71f78 ffffffff81073115 656c69666f72706f 0000000000610200
      Call Trace:
       [<ffffffffa0013532>] op_nmi_exit+0x15/0x17 [oprofile]
       [<ffffffffa00132d6>] oprofile_arch_exit+0xe/0x10 [oprofile]
       [<ffffffffa00159b2>] oprofile_exit+0x1e/0x20 [oprofile]
       [<ffffffff81073115>] sys_delete_module+0x1c3/0x22f
       [<ffffffff811bf09e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
       [<ffffffff8148070b>] system_call_fastpath+0x16/0x1b
      Code: 20 c6 78 81 e8 c5 cc 23 00 48 8b 13 48 8b 43 08 48 be 00 01 10 00 00 00 ad de 48 b9 00 02 20 00 00 00 ad de 48 c7 c7 20 c6 78 81
       89 42 08 48 89 10 48 89 33 48 89 4b 08 e8 a6 c0 23 00 5a 5b
      RIP  [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
       RSP <ffff88041ef71e98>
      CR2: 0000000000000008
      ---[ end trace 43a541a52956b7b0 ]---
      
      CC: stable@kernel.org # 2.6.37+
      Signed-off-by: NRobert Richter <robert.richter@amd.com>
      97f7f818
    • R
      oprofile: Fix crash when unloading module (hr timer mode) · 87121ca5
      Robert Richter 提交于
      Oprofile may crash in a KVM guest while unlaoding modules. This
      happens if oprofile_arch_init() fails and oprofile switches to the hr
      timer mode as a fallback. In this case oprofile_arch_exit() is called,
      but it never was initialized properly which causes the crash. This
      patch fixes this.
      
      oprofile: using timer interrupt.
      BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
      IP: [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
      PGD 41da3f067 PUD 41d80e067 PMD 0
      Oops: 0002 [#1] PREEMPT SMP
      CPU 5
      Modules linked in: oprofile(-)
      
      Pid: 2382, comm: modprobe Not tainted 3.1.0-rc7-00018-g709a39d #18 Advanced Micro Device Anaheim/Anaheim
      RIP: 0010:[<ffffffff8123c226>]  [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
      RSP: 0018:ffff88041de1de98  EFLAGS: 00010296
      RAX: 0000000000000000 RBX: ffffffffa00060e0 RCX: dead000000200200
      RDX: 0000000000000000 RSI: dead000000100100 RDI: ffffffff8178c620
      RBP: ffff88041de1dea8 R08: 0000000000000001 R09: 0000000000000082
      R10: 0000000000000000 R11: ffff88041de1dde8 R12: 0000000000000080
      R13: fffffffffffffff5 R14: 0000000000000001 R15: 0000000000610210
      FS:  00007f9ae5bef700(0000) GS:ffff88042fd40000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 0000000000000008 CR3: 000000041ca44000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process modprobe (pid: 2382, threadinfo ffff88041de1c000, task ffff88042db6d040)
      Stack:
       ffff88041de1deb8 ffffffffa0006770 ffff88041de1deb8 ffffffffa000251e
       ffff88041de1dec8 ffffffffa00022c2 ffff88041de1ded8 ffffffffa0004993
       ffff88041de1df78 ffffffff81073115 656c69666f72706f 0000000000610200
      Call Trace:
       [<ffffffffa000251e>] op_nmi_exit+0x15/0x17 [oprofile]
       [<ffffffffa00022c2>] oprofile_arch_exit+0xe/0x10 [oprofile]
       [<ffffffffa0004993>] oprofile_exit+0x13/0x15 [oprofile]
       [<ffffffff81073115>] sys_delete_module+0x1c3/0x22f
       [<ffffffff811bf09e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
       [<ffffffff8148070b>] system_call_fastpath+0x16/0x1b
      Code: 20 c6 78 81 e8 c5 cc 23 00 48 8b 13 48 8b 43 08 48 be 00 01 10 00 00 00 ad de 48 b9 00 02 20 00 00 00 ad de 48 c7 c7 20 c6 78 81
       89 42 08 48 89 10 48 89 33 48 89 4b 08 e8 a6 c0 23 00 5a 5b
      RIP  [<ffffffff8123c226>] unregister_syscore_ops+0x41/0x58
       RSP <ffff88041de1de98>
      CR2: 0000000000000008
      ---[ end trace 06d4e95b6aa3b437 ]---
      
      CC: stable@kernel.org # 2.6.37+
      Signed-off-by: NRobert Richter <robert.richter@amd.com>
      87121ca5
  4. 10 10月, 2011 4 次提交
    • D
      x86, nmi: Wire up NMI handlers to new routines · 9c48f1c6
      Don Zickus 提交于
      Just convert all the files that have an nmi handler to the new routines.
      Most of it is straight forward conversion.  A couple of places needed some
      tweaking like kgdb which separates the debug notifier from the nmi handler
      and mce removes a call to notify_die.
      
      [Thanks to Ying for finding out the history behind that mce call
      
      https://lkml.org/lkml/2010/5/27/114
      
      And Boris responding that he would like to remove that call because of it
      
      https://lkml.org/lkml/2011/9/21/163]
      
      The things that get converted are the registeration/unregistration routines
      and the nmi handler itself has its args changed along with code removal
      to check which list it is on (most are on one NMI list except for kgdb
      which has both an NMI routine and an NMI Unknown routine).
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Acked-by: NCorey Minyard <minyard@acm.org>
      Cc: Jason Wessel <jason.wessel@windriver.com>
      Cc: Andi Kleen <ak@linux.intel.com>
      Cc: Robert Richter <robert.richter@amd.com>
      Cc: Huang Ying <ying.huang@intel.com>
      Cc: Corey Minyard <minyard@acm.org>
      Cc: Jack Steiner <steiner@sgi.com>
      Link: http://lkml.kernel.org/r/1317409584-23662-4-git-send-email-dzickus@redhat.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      9c48f1c6
    • D
      x86, nmi: Create new NMI handler routines · c9126b2e
      Don Zickus 提交于
      The NMI handlers used to rely on the notifier infrastructure.  This worked
      great until we wanted to support handling multiple events better.
      
      One of the key ideas to the nmi handling is to process _all_ the handlers for
      each NMI.  The reason behind this switch is because NMIs are edge triggered.
      If enough NMIs are triggered, then they could be lost because the cpu can
      only latch at most one NMI (besides the one currently being processed).
      
      In order to deal with this we have decided to process all the NMI handlers
      for each NMI.  This allows the handlers to determine if they recieved an
      event or not (the ones that can not determine this will be left to fend
      for themselves on the unknown NMI list).
      
      As a result of this change it is now possible to have an extra NMI that
      was destined to be received for an already processed event.  Because the
      event was processed in the previous NMI, this NMI gets dropped and becomes
      an 'unknown' NMI.  This of course will cause printks that scare people.
      
      However, we prefer to have extra NMIs as opposed to losing NMIs and as such
      are have developed a basic mechanism to catch most of them.  That will be
      a later patch.
      
      To accomplish this idea, I unhooked the nmi handlers from the notifier
      routines and created a new mechanism loosely based on doIRQ.  The reason
      for this is the notifier routines have a couple of shortcomings.  One we
      could't guarantee all future NMI handlers used NOTIFY_OK instead of
      NOTIFY_STOP.  Second, we couldn't keep track of the number of events being
      handled in each routine (most only handle one, perf can handle more than one).
      Third, I wanted to eventually display which nmi handlers are registered in
      the system in /proc/interrupts to help see who is generating NMIs.
      
      The patch below just implements the new infrastructure but doesn't wire it up
      yet (that is the next patch).  Its design is based on doIRQ structs and the
      atomic notifier routines.  So the rcu stuff in the patch isn't entirely untested
      (as the notifier routines have soaked it) but it should be double checked in
      case I copied the code wrong.
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1317409584-23662-3-git-send-email-dzickus@redhat.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      c9126b2e
    • D
      x86, nmi: Split out nmi from traps.c · 1d48922c
      Don Zickus 提交于
      The nmi stuff is changing a lot and adding more functionality.  Split it
      out from the traps.c file so it doesn't continue to pollute that file.
      
      This makes it easier to find and expand all the future nmi related work.
      
      No real functional changes here.
      Signed-off-by: NDon Zickus <dzickus@redhat.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1317409584-23662-2-git-send-email-dzickus@redhat.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      1d48922c
    • G
      perf, intel: Use GO/HO bits in perf-ctr · 144d31e6
      Gleb Natapov 提交于
      Intel does not have guest/host-only bit in perf counters like AMD
      does.  To support GO/HO bits KVM needs to switch EVENTSELn values
      (or PERF_GLOBAL_CTRL if available) at a guest entry. If a counter is
      configured to count only in a guest mode it stays disabled in a host,
      but VMX is configured to switch it to enabled value during guest entry.
      
      This patch adds GO/HO tracking to Intel perf code and provides interface
      for KVM to get a list of MSRs that need to be switched on a guest entry.
      
      Only cpus with architectural PMU (v1 or later) are supported with this
      patch.  To my knowledge there is not p6 models with VMX but without
      architectural PMU and p4 with VMX are rare and the interface is general
      enough to support them if need arise.
      Signed-off-by: NGleb Natapov <gleb@redhat.com>
      Signed-off-by: NPeter Zijlstra <a.p.zijlstra@chello.nl>
      Link: http://lkml.kernel.org/r/1317816084-18026-7-git-send-email-gleb@redhat.comSigned-off-by: NIngo Molnar <mingo@elte.hu>
      144d31e6
  5. 06 10月, 2011 4 次提交
  6. 05 10月, 2011 12 次提交
  7. 04 10月, 2011 12 次提交