1. 23 Apr 2009, 1 commit
  2. 08 Apr 2009, 1 commit
  3. 04 Apr 2009, 1 commit
  4. 27 Mar 2009, 2 commits
  5. 17 Mar 2009, 1 commit
    • A
      acpi: fix of pmtimer overflow that make Cx states time incorrect · ff69f2bb
      Committed by alex.shi
      We found the Cx state times abnormal on some of our machines that have 16
      logical CPUs: C0 takes far too much time while the system is really idle,
      with tickless and highres enabled in the kernel.  powertop output is below:
      
           PowerTOP version 1.9       (C) 2007 Intel Corporation
      
      Cn                Avg residency       P-states (frequencies)
      C0 (cpu running)        (40.5%)         2.53 Ghz     0.0%
      C1                0.0ms ( 0.0%)         2.53 Ghz     0.0%
      C2              128.8ms (59.5%)         2.40 Ghz     0.0%
                                              1.60 Ghz   100.0%
      
      Wakeups-from-idle per second :  4.7     interval: 20.0s
      no ACPI power usage estimate available
      
      Top causes for wakeups:
        41.4% ( 24.9)       <interrupt> : extra timer interrupt
        20.2% ( 12.2)     <kernel core> : usb_hcd_poll_rh_status
      (rh_timer_func)
      
      After tracking this issue in detail, Yakui and I found it is due to the
      24-bit PM timer overflowing when a CPU sleeps for more than 4 seconds.  With
      a tickless kernel, the CPU wants to sleep as long as possible when the system
      is idle.  But the Cx sleep times are recorded via the PM timer, whose width
      is determined by the BIOS.  The current Cx time is obtained in the following
      function from drivers/acpi/processor_idle.c:
      
      static inline u32 ticks_elapsed(u32 t1, u32 t2)
      {
             if (t2 >= t1)
                     return (t2 - t1);
             else if (!(acpi_gbl_FADT.flags & ACPI_FADT_32BIT_TIMER))
                     return (((0x00FFFFFF - t1) + t2) & 0x00FFFFFF);
             else
                     return ((0xFFFFFFFF - t1) + t2);
      }
      
      If the PM timer is 24 bits wide and it takes 5 seconds to get from t1 to t2,
      the function above records only about 1 second's worth of ticks, so the Cx
      time is under-reported by about 4 seconds.  This is why we see the powertop
      output above.
      
      To resolve this problem, Yakui and I use ktime_get() to record the Cx state
      times instead of the PM timer, as in the following patch.  The patch was
      tested in i386 and x86_64 modes on several platforms.
      Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Tested-by: Alex Shi <alex.shi@intel.com>
      Signed-off-by: Alex Shi <alex.shi@intel.com>
      Signed-off-by: Yakui.zhao <yakui.zhao@intel.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Len Brown <len.brown@intel.com>
      ff69f2bb
  6. 07 Feb 2009, 1 commit
    • L
      ACPI: delete CPU_IDLE=n code · 9fdd54f2
      Committed by Len Brown
      CPU_IDLE=y has been default for ACPI=y since Nov-2007,
      and has shipped in many distributions since then.
      
      Here we delete the CPU_IDLE=n ACPI idle code, since
      nobody should be using it, and we don't want to
      maintain two versions.
      Signed-off-by: Len Brown <len.brown@intel.com>
      9fdd54f2
  7. 29 Jan 2009, 2 commits
    • L
      ACPI: remove BM_RLD access from idle entry path · 31878dd8
      Committed by Len Brown
      It is true that BM_RLD needs to be set to enable
      bus master activity to wake an older chipset (eg PIIX4) from C3.
      
      This is contrary to the erroneous wording in the ACPI 2.0 and 3.0
      specifications, which suggests that BM_RLD is an indicator
      rather than a control bit.
      
      ACPI 1.0's correct wording should be restored in ACPI 4.0:
      http://www.acpica.org/bugzilla/show_bug.cgi?id=689
      
      But the kernel should not have to clear BM_RLD
      when entering a non C3-type state just to set
      it again when entering a C3-type C-state.
      
      We should be able to set BM_RLD at boot time
      and leave it alone -- removing the overhead of
      accessing this IO register from the idle entry path.
      Signed-off-by: Len Brown <len.brown@intel.com>
      31878dd8
    • L
      ACPI: remove locking from PM1x_STS register reads · a2b7b01c
      Committed by Len Brown
      PM1a_STS and PM1b_STS are twins that get OR'd together
      on reads, and all writes are repeated to both.
      
      The fields in PM1x_STS are single bits only;
      there are no multi-bit fields.
      
      So it is not necessary to lock PM1x_STS reads against
      writes, because it is impossible to read an intermediate
      value of a single bit: it will be either 0 or 1,
      even if a write is in progress during the read.
      Reads are asynchronous to writes whether or not a lock
      is used.
      Signed-off-by: Len Brown <len.brown@intel.com>
      a2b7b01c
  8. 07 Jan 2009, 1 commit
    • R
      remove linux/hardirq.h from asm-generic/local.h · ba84be23
      Committed by Russell King
      While looking at reducing the amount of architecture namespace pollution
      in the generic kernel, I found that asm/irq.h is included in the vast
      majority of compilations on ARM (around 650 files).
      
      Since asm/irq.h includes a sub-architecture include file on ARM, this
      causes a negative impact on the ccache's ability to re-use the build
      results from other sub-architectures, so we have a desire to reduce the
      dependencies on asm/irq.h.
      
      It turns out that a major cause of this is the needless include of
      linux/hardirq.h into asm-generic/local.h.  The patch below removes this
      include, resulting in some 250 to 300 files (around half) of the kernel
      then omitting asm/irq.h.
      
      My test builds still succeed, provided two ARM files are fixed
      (arch/arm/kernel/traps.c and arch/arm/mm/fault.c) - so there may be
      negative impacts for this on other architectures.
      
      Note that x86 does not include asm/irq.h nor linux/hardirq.h in its
      asm/local.h, so this patch can be viewed as bringing the generic version
      into line with the x86 version.
      
      [kosaki.motohiro@jp.fujitsu.com: add #include <linux/irqflags.h> to acpi/processor_idle.c]
      [adobriyan@gmail.com: fix sparc64]
      Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
      Signed-off-by: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Alexey Dobriyan <adobriyan@gmail.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      ba84be23
  9. 17 Dec 2008, 1 commit
    • V
      x86: support always running TSC on Intel CPUs · 40fb1715
      Committed by Venki Pallipadi
      Impact: reward non-stop TSCs with good TSC-based clocksources, etc.
      
      Add support for CPUID_0x80000007_Bit8 on Intel CPUs as well. This bit means
      that the TSC is invariant with C/P/T states and always runs at constant
      frequency.
      
      With Intel CPUs, we have 3 classes:
      * CPUs where the TSC runs at a constant rate and does not stop in C-states
      * CPUs where the TSC runs at a constant rate, but will stop in deep C-states
      * CPUs where the TSC rate varies based on P/T-states, and the TSC stops in
        deep C-states.
      
      To cover these 3, one feature bit (CONSTANT_TSC) is not enough. So, add a
      second bit (NONSTOP_TSC). CONSTANT_TSC indicates that the TSC runs at
      constant frequency irrespective of P/T-states, and NONSTOP_TSC indicates
      that TSC does not stop in deep C-states.
      
      CPUID_0x80000007_Bit8 indicates that both of these feature bits can be set.
      We still have CONSTANT_TSC _set_ and NONSTOP_TSC _not_set_ on some older Intel
      CPUs, based on model checks. We can use the TSC on such CPUs for time, as long
      as those CPUs do not support/enter deep C-states.
      Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      40fb1715
  10. 08 Nov 2008, 1 commit
  11. 17 Oct 2008, 1 commit
  12. 15 Aug 2008, 1 commit
  13. 28 Jul 2008, 1 commit
  14. 26 Jul 2008, 1 commit
  15. 17 Jul 2008, 3 commits
  16. 26 Jun 2008, 1 commit
  17. 12 Jun 2008, 1 commit
    • V
      cpuidle acpi driver: fix oops on AC<->DC · dcb84f33
      Committed by Venkatesh Pallipadi
      There is a cpuidle/acpi driver interaction bug in the way
      cpuidle_register_driver() is called. Due to this bug, there will be an oops on
      AC<->DC transitions on some systems that support C-states on DC but not on AC.
      
      The current code does
      ON BOOT:
      	Look at CST and other C-state info to see whether more than C1 is
      	supported. If it is, then acpi processor_idle does a
      	cpuidle_register_driver() call, which internally enables the device.
      
      ON CST change notification (AC<->DC) and on suspend-resume:
      	acpi driver temporarily disables device, updates the device with
      	any new C-states, and reenables the device.
      
      The problem is that on boot there are no C2, C3 states supported and we skip
      the register. Later, on an AC<->DC transition, we may get a CST notification
      and try to re-evaluate CST and enable the device, without actually registering
      it. This causes breakage, as we try to create the /sys fs subdirectory without
      the parent directory, which is created at register time.
      
      Thanks to Sanjeev for reporting the problem here:
      http://bugzilla.kernel.org/show_bug.cgi?id=10394
      Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
      dcb84f33
  18. 01 May 2008, 1 commit
    • V
      ACPI: Fix acpi_processor_idle and idle= boot parameters interaction · 36a91358
      Committed by Venkatesh Pallipadi
      The acpi_processor_idle and "idle=" boot parameter interaction is broken.
      The problem is that at boot time the acpi driver checks for the "idle=" boot
      option and does not register the acpi idle handler. But when there is a CST
      change callback (typically when switching AC <-> battery, or on suspend-resume)
      there is no check of boot_option_idle_override, and the acpi idle handler tries
      to get installed, with nasty side effects.
      
      With CPU_IDLE configured, this issue results in a nasty oops on the CST
      change callback; without CPU_IDLE there is no oops, but the "idle=" boot
      option gets ignored and the acpi idle handler gets installed.
      
      Change the behavior to not do anything in acpi idle handler when there is a
      "idle=" boot option.
      
      Note that the problem is only there when "idle=" boot option is used.
      Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
      36a91358
  19. 29 Apr 2008, 1 commit
  20. 27 Apr 2008, 1 commit
    • P
      fix idle (arch, acpi and apm) and lockdep · 7f424a8b
      Committed by Peter Zijlstra
      OK, so 25-mm1 gave a lockdep error which made me look into this.
      
      The first thing that I noticed was the horrible mess; the second thing I
      saw was hacks like: 71e93d15
      
      The problem is that arch idle routines are somewhat inconsistent in
      their IRQ state handling, and instead of fixing _that_, we paper over
      the problem.
      
      So the thing I've tried to do is set a standard for idle routines and
      fix them all up to adhere to it. The rules are:
      
        idle routines are entered with IRQs disabled
        idle routines will exit with IRQs enabled
      
      Nearly all already did this in one form or another.
      
      Merge the 32 and 64 bit bits so they no longer have different bugs.
      
      As for the actual lockdep warning: __sti_mwait() did a plainly un-annotated
      irq-enable.
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Tested-by: Bob Copeland <me@bobcopeland.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      7f424a8b
  21. 25 Apr 2008, 1 commit
  22. 26 Mar 2008, 2 commits
  23. 14 Mar 2008, 1 commit
    • V
      ACPI: lockdep warning on boot, 2.6.25-rc5 · 71e93d15
      Committed by Venki Pallipadi
      This avoids the harmless WARNING by lockdep in acpi_processor_idle().
      
      The reason for the WARNING is that, deep in the idle handling code,
      some of the idle handlers sometimes disable interrupts while returning from
      the idle handler. After return, acpi_processor_idle() and a few other routines
      in the file did an unconditional local_irq_enable(). With LOCKDEP, enabling
      IRQs when they are already enabled generates the WARNING below.
      
      > > [    0.593038] ------------[ cut here ]------------
      > > [    0.593267] WARNING: at kernel/lockdep.c:2035 trace_hardirqs_on+0xa0/0x115()
      > > [    0.593596] Modules linked in:
      > > [    0.593756] Pid: 0, comm: swapper Not tainted 2.6.25-rc5 #8
      > > [    0.594017]
      > > [    0.594017] Call Trace:
      > > [    0.594216]  [<ffffffff80231663>] warn_on_slowpath+0x58/0x6b
      > > [    0.594495]  [<ffffffff80495966>] ? _spin_unlock_irqrestore+0x38/0x47
      > > [    0.594809]  [<ffffffff80329a86>] ? acpi_os_release_lock+0x9/0xb
      > > [    0.595103]  [<ffffffff80337840>] ? acpi_set_register+0x161/0x173
      > > [    0.595401]  [<ffffffff8034c8d4>] ? acpi_processor_idle+0x1de/0x546
      > > [    0.595706]  [<ffffffff8020a23b>] ? default_idle+0x0/0x73
      > > [    0.595970]  [<ffffffff8024fc0e>] trace_hardirqs_on+0xa0/0x115
      > > [    0.596049]  [<ffffffff8034c6f6>] ? acpi_processor_idle+0x0/0x546
      > > [    0.596346]  [<ffffffff8034c8d4>] acpi_processor_idle+0x1de/0x546
      > > [    0.596642]  [<ffffffff8020a23b>] ? default_idle+0x0/0x73
      > > [    0.596912]  [<ffffffff8034c6f6>] ? acpi_processor_idle+0x0/0x546
      > > [    0.597209]  [<ffffffff8020a23b>] ? default_idle+0x0/0x73
      > > [    0.597472]  [<ffffffff8020a355>] cpu_idle+0xa7/0xd1
      > > [    0.597717]  [<ffffffff80485fa1>] rest_init+0x55/0x57
      > > [    0.597957]  [<ffffffff8062fb49>] start_kernel+0x29d/0x2a8
      > > [    0.598215]  [<ffffffff8062f1da>] _sinittext+0x1da/0x1e1
      > > [    0.598464]
      > > [    0.598546] ---[ end trace 778e504de7e3b1e3 ]---
      Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: Len Brown <len.brown@intel.com>
      71e93d15
  24. 20 Feb 2008, 1 commit
  25. 14 Feb 2008, 2 commits
  26. 07 Feb 2008, 4 commits
  27. 06 Feb 2008, 1 commit
  28. 30 Jan 2008, 2 commits
    • A
      x86: don't disable TSC in any C states on AMD Fam10h · ddb25f9a
      Committed by Andi Kleen
      The ACPI code currently disables TSC use in any C2 and C3
      states. But the AMD Fam10h BKDG documents that the TSC
      will never stop in any C-state when the CONSTANT_TSC bit is
      set. Make this disabling conditional on CONSTANT_TSC
      not being set, on AMD.
      
      I actually think this is true on Intel too for C2 states
      on CPUs with p-state invariant TSC, but this needs
      further discussions with Len to really confirm :-)
      
      So far it is only enabled on AMD.
      
      Cc: lenb@kernel.org
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      ddb25f9a
    • V
      x86: voluntary leave_mm before entering ACPI C3 · bde6f5f5
      Committed by Venki Pallipadi
      Avoid TLB flush IPIs during C3 states by voluntarily calling leave_mm()
      before entering C3.
      
      The performance impact of TLB flush on C3 should not be significant with
      respect to C3 wakeup latency. Also, CPUs tend to flush TLB in hardware while in
      C3 anyways.
      
      On an 8-logical-CPU system running make -j2, the number of tlbflush IPIs goes
      down from 40 per second to ~0. The total number of interrupts during the run
      of this workload was ~1200 per second, which makes the savings ~3% of wakeups.
      
      There was no measurable performance or power impact however.
      
      [ akpm@linux-foundation.org: symbol export fixes. ]
      Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      bde6f5f5
  29. 08 Jan 2008, 1 commit
  30. 14 Dec 2007, 1 commit
    • L
      cpuidle: default processor.latency_factor=2 · 25de5718
      Committed by Len Brown
      More aggressively request deep C-states.
      
      Note that the job of the OS is to minimize the latency
      impact on expected break events such as interrupts.
      It is not the job of the OS to try to calculate if
      the C-state will reach energy break-even.
      The platform doesn't give the OS enough information
      for it to make that calculation.  Thus, it is up
      to the platform to decide if it is worth it to
      go as deep as the OS requested it to, or if it
      should internally demote to a more shallow C-state.
      
      But the converse is not true: the platform cannot
      promote into a deeper C-state than the OS requested,
      or it may violate latency constraints.  So it is
      important that the OS be aggressive in giving the
      platform permission to enter deep C-states.
      Signed-off-by: Len Brown <len.brown@intel.com>
      25de5718