1. 10 May, 2010 2 commits
  2. 01 April, 2010 2 commits
  3. 30 March, 2010 1 commit
    • T
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo committed
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities to include
      those headers directly instead of assuming availability.  As this
      conversion needs to touch a large number of source files, the following
      script is used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      The script does the following.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and tries to put the new include such that its order conforms
        to its surroundings.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         widely available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build tests were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable on most builds of
      the specific arch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
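The include-selection rule described in the commit above ("if only gfp is used, gfp.h, if slab is used, slab.h") can be sketched roughly as follows. This is not the slabh-sweep.py script itself: the identifier patterns and the `needed_include` helper are assumptions made for illustration, and the real script's detection logic is not reproduced here.

```python
import re

# Hypothetical, deliberately small identifier lists; the patterns used by
# the real conversion script are not known from the commit message.
SLAB_RE = re.compile(r"\b(kmalloc|kzalloc|kcalloc|kfree|kmem_cache_\w+)\s*\(")
GFP_RE = re.compile(r"\b(GFP_[A-Z_]+|__GFP_[A-Z_]+|gfp_t)\b")


def needed_include(source):
    """Return the one header a file should include, or None if neither is used."""
    if SLAB_RE.search(source):
        # slab users get slab.h, which in turn includes gfp.h
        return "linux/slab.h"
    if GFP_RE.search(source):
        # gfp-only users get gfp.h
        return "linux/gfp.h"
    return None


print(needed_include("p = kmalloc(16, GFP_KERNEL);"))
print(needed_include("page = alloc_pages(GFP_KERNEL, 0);"))
```

Checking for slab usage first means a file that uses both facilities ends up with slab.h only, matching the "only the necessary includes" goal stated in the commit.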
  4. 08 March, 2010 1 commit
  5. 13 January, 2010 1 commit
  6. 25 November, 2009 3 commits
    • T
      [ACPI/CPUFREQ] Introduce bios_limit per cpu cpufreq sysfs interface · e2f74f35
      Thomas Renninger committed
      This interface is mainly intended (and implemented) for ACPI _PPC BIOS
      frequency limitations, but other cpufreq drivers can also use it for
      similar use-cases.
      
      Why is this needed:
      
      Currently it's not obvious why cpufreq got limited.
      People see cpufreq/scaling_max_freq reduced, but this could have
      happened by:
        - any userspace prog writing to scaling_max_freq
        - thermal limitations
        - hardware (_PPC in ACPI case) limitations
      
      Therefore export bios_limit (in kHz) to:
        - Point out to the user that it's the BIOS (broken or intended) which
          limits frequency
        - Export it as a sysfs interface for userspace progs.
          While this was a rarely used feature on laptops, there will appear
          more and more server implementations providing "Green IT" features like
          allowing the service processor to limit the frequency. People want
          to know about HW/BIOS frequency limitations.
      
      All ACPI P-state driven cpufreq drivers are covered with this patch:
        - powernow-k8
        - powernow-k7
        - acpi-cpufreq
      
      Tested with a patched DSDT which limits the first two cores (_PPC returns 1)
      via _PPC, exposed by bios_limit:
      # echo 2200000 >cpu2/cpufreq/scaling_max_freq
      # cat cpu*/cpufreq/scaling_max_freq
      2600000
      2600000
      2200000
      2200000
      # #scaling_max_freq shows general user/thermal/BIOS limitations
      
      # cat cpu*/cpufreq/bios_limit
      2600000
      2600000
      2800000
      2800000
      # #bios_limit only shows the HW/BIOS limitation
      
      CC: Pallipadi Venkatesh <venkatesh.pallipadi@intel.com>
      CC: Len Brown <lenb@kernel.org>
      CC: davej@codemonkey.org.uk
      CC: linux@dominikbrodowski.net
      Signed-off-by: Thomas Renninger <trenn@suse.de>
      Signed-off-by: Dave Jones <davej@redhat.com>
      e2f74f35
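As a rough illustration of how userspace might use the two values shown in the test output above, the sketch below compares scaling_max_freq against bios_limit per CPU. The numbers are copied from the commit message; the `classify` helper and its labels are hypothetical, and treating "scaling_max_freq equals bios_limit" as "the BIOS cap is the binding limit" is an assumption, not something the commit states.

```python
# Values (kHz) copied from the test output in the commit message.
scaling_max = [2600000, 2600000, 2200000, 2200000]
bios_limit = [2600000, 2600000, 2800000, 2800000]


def classify(scaling_max_khz, bios_limit_khz):
    """Hypothetical helper: is the effective maximum sitting at the BIOS cap?"""
    if scaling_max_khz >= bios_limit_khz:
        return "at BIOS limit"
    # Something other than the BIOS (user setting, thermal) is stricter.
    return "below BIOS limit"


report = [classify(s, b) for s, b in zip(scaling_max, bios_limit)]
for cpu, label in enumerate(report):
    print(f"cpu{cpu}: {label}")
```

With the sample data, the first two cores sit at the BIOS limit, while cpu2 and cpu3 are held lower by the manual write to scaling_max_freq.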
    • A
      [CPUFREQ] make internal cpufreq_add_dev_* static · cf3289d0
      Alex Chiang committed
      No need to export these symbols; make them static.
      
      	cpufreq_add_dev_policy
      	cpufreq_add_dev_symlink
      	cpufreq_add_dev_interface
      Signed-off-by: Alex Chiang <achiang@hp.com>
      Signed-off-by: Dave Jones <davej@redhat.com>
      cf3289d0
    • T
      [CPUFREQ] Use global sysfs cpufreq structure for conservative governor tunings · 49b015ce
      Thomas Renninger committed
      Same adjustments that have been added to the ondemand recently.
      Signed-off-by: Thomas Renninger <trenn@suse.de>
      Signed-off-by: Dave Jones <davej@redhat.com>
      49b015ce
  7. 18 November, 2009 3 commits
  8. 29 October, 2009 1 commit
    • T
      percpu: make percpu symbols in cpufreq unique · f1625066
      Tejun Heo committed
      This patch updates percpu related symbols in cpufreq such that percpu
      symbols are unique and don't clash with local symbols.  This serves
      two purposes of decreasing the possibility of global percpu symbol
      collision and allowing dropping per_cpu__ prefix from percpu symbols.
      
      * drivers/cpufreq/cpufreq.c: s/policy_cpu/cpufreq_policy_cpu/
      * drivers/cpufreq/freq_table.c: s/show_table/cpufreq_show_table/
      * arch/x86/kernel/cpu/cpufreq/acpi-cpufreq.c: s/drv_data/acfreq_data/
        					      s/old_perf/acfreq_old_perf/
      
      Partly based on Rusty Russell's "alloc_percpu: rename percpu vars
      which cause name clashes" patch.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Rusty Russell <rusty@rustcorp.com.au>
      f1625066
  9. 02 September, 2009 10 commits
    • M
      [CPUFREQ] remove rwsem lock from CPUFREQ_GOV_STOP call (second call site) · 395913d0
      Mathieu Desnoyers committed
      remove rwsem lock from CPUFREQ_GOV_STOP call (second call site)
      
      commit	42a06f21
      
      Missed a call site for CPUFREQ_GOV_STOP to remove the rwlock taken around the
      teardown. To make a long story short, the rwlock write-lock causes a circular
      dependency with cancel_delayed_work_sync(), because the timer handler takes the
      read lock.
      
      Note that all callers to __cpufreq_set_policy are taking the rwsem. All sysfs
      callers (writers) hold the write rwsem at the earliest sysfs calling stage.
      
      However, the rwlock write-lock is not needed upon governor stop.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Acked-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      CC: rjw@sisk.pl
      CC: mingo@elte.hu
      CC: Shaohua Li <shaohua.li@intel.com>
      CC: Pekka Enberg <penberg@cs.helsinki.fi>
      CC: Dave Young <hidave.darkstar@gmail.com>
      CC: "Rafael J. Wysocki" <rjw@sisk.pl>
      CC: Rusty Russell <rusty@rustcorp.com.au>
      CC: trenn@suse.de
      CC: sven.wegener@stealer.net
      CC: cpufreq@vger.kernel.org
      Signed-off-by: Dave Jones <davej@redhat.com>
      395913d0
    • T
      [CPUFREQ] ondemand - Use global sysfs dir for tuning settings · 0e625ac1
      Thomas Renninger committed
      Ondemand has only global variables for userspace tunings via sysfs.
      But they were exposed per CPU which wrongly implies to the user that
      his settings are applied per cpu. Also locking sysfs against concurrent
      access won't be necessary anymore after deprecation time.
      
      This means the ondemand config dir is moved:
      /sys/devices/system/cpu/cpu*/cpufreq/ondemand ->
           /sys/devices/system/cpu/cpufreq/ondemand
      
      The old files will still exist, but reading or writing to them will
      result in one (printk_once) deprecation msg to syslog per file.
      Signed-off-by: Thomas Renninger <trenn@suse.de>
      Signed-off-by: Dave Jones <davej@redhat.com>
      0e625ac1
    • T
      [CPUFREQ] Introduce global, not per core: /sys/devices/system/cpu/cpufreq · 8aa84ad8
      Thomas Renninger committed
      Currently everything in the cpufreq layer is per core based.
      This does not reflect reality, for example the ondemand and conservative
      governors have global sysfs variables.
      
      Introduce a global cpufreq directory and add the kobject to the governor
      struct, so that governors can easily access it.
      The directory is initialized in the cpufreq_core_init initcall and thus will
      always be created if cpufreq is compiled in, even if no cpufreq driver is
      active later.
      Signed-off-by: Thomas Renninger <trenn@suse.de>
      Signed-off-by: Dave Jones <davej@redhat.com>
      8aa84ad8
    • T
      [CPUFREQ] Bail out of cpufreq_add_dev if the link for a managed CPU got created · 4bfa042c
      Thomas Renninger committed
      Doing:
      echo 0 >cpu1/online
      echo 1 >cpu1/online
      
      on a managed CPU will result in:
      Jul 22 15:15:37 linux kernel: [   80.013864] WARNING: at fs/sysfs/dir.c:487 sysfs_add_one+0xcf/0xe6()
      Jul 22 15:15:37 linux kernel: [   80.013866] Hardware name: To Be Filled By O.E.M.
      Jul 22 15:15:37 linux kernel: [   80.013868] sysfs: cannot create duplicate filename '/devices/system/cpu/cpu1/cpufreq'
      Jul 22 15:15:37 linux kernel: [   80.013870] Modules linked in: powernow_k8
      Jul 22 15:15:37 linux kernel: [   80.013874] Pid: 5750, comm: bash Not tainted 2.6.31-rc2 #40
      Jul 22 15:15:37 linux kernel: [   80.013876] Call Trace:
      Jul 22 15:15:37 linux kernel: [   80.013879]  [<ffffffff8112ebda>] ? sysfs_add_one+0xcf/0xe6
      Jul 22 15:15:37 linux kernel: [   80.013884]  [<ffffffff81041926>] warn_slowpath_common+0x77/0xa4
      Jul 22 15:15:37 linux kernel: [   80.013888]  [<ffffffff810419a0>] warn_slowpath_fmt+0x3c/0x3e
      Jul 22 15:15:37 linux kernel: [   80.013891]  [<ffffffff8112ebda>] sysfs_add_one+0xcf/0xe6
      Jul 22 15:15:37 linux kernel: [   80.013894]  [<ffffffff8112f213>] create_dir+0x58/0x87
      Jul 22 15:15:37 linux kernel: [   80.013898]  [<ffffffff8112f27a>] sysfs_create_dir+0x38/0x4f
      Jul 22 15:15:37 linux kernel: [   80.013902]  [<ffffffff811ffb8a>] kobject_add_internal+0x11f/0x1de
      Jul 22 15:15:37 linux kernel: [   80.013905]  [<ffffffff811ffd21>] kobject_add_varg+0x41/0x4e
      Jul 22 15:15:37 linux kernel: [   80.013908]  [<ffffffff811ffd7a>] kobject_init_and_add+0x4c/0x57
      Jul 22 15:15:37 linux kernel: [   80.013913]  [<ffffffff810667bc>] ? mark_lock+0x22/0x228
      Jul 22 15:15:37 linux kernel: [   80.013918]  [<ffffffff813e8a3b>] cpufreq_add_dev_interface+0x40/0x1e4
      ...
      
      This bug slipped in by git commit:
      150b06f7f223cfd0f808737a5243cceca8ea47fa
      
      When splitting up cpufreq_add_dev, the whole cpufreq_add_dev function
      is not left anymore, only cpufreq_add_dev_policy.
      This patch should reconstruct the identical functionality again as it
      was before the split.
      
      CC: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: Thomas Renninger <trenn@suse.de>
      Signed-off-by: Dave Jones <davej@redhat.com>
      4bfa042c
    • D
      [CPUFREQ] Factor out policy setting from cpufreq_add_dev · ecf7e461
      Dave Jones committed
      Signed-off-by: Dave Jones <davej@redhat.com>
      ecf7e461
    • D
      909a694e
    • D
      19d6f7ec
    • D
      [CPUFREQ] cleanup up -ENOMEM handling in cpufreq_add_dev · 059019a3
      Dave Jones committed
      Signed-off-by: Dave Jones <davej@redhat.com>
      059019a3
    • D
      [CPUFREQ] Reduce scope of cpu_sys_dev in cpufreq_add_dev · 54e6fe16
      Dave Jones committed
      Signed-off-by: Dave Jones <davej@redhat.com>
      54e6fe16
    • D
      [CPUFREQ] Re-enable cpufreq suspend and resume code · ce6c3997
      Dominik Brodowski committed
      Commit 4bc5d341 is broken and causes regressions:
      
      (1) cpufreq_driver->resume() and ->suspend() were only called on
      __powerpc__, but you could set them on all architectures. In fact,
      ->resume() was defined and used before the PPC-related commit
      42d4dc3f complained about in 4bc5d341.
      
      (2) Therefore, the resume functions in acpi_cpufreq and speedstep-smi
      would never be called.
      
      (3) This means speedstep-smi would be unusable after suspend or resume.
      
      The _real_ problem was calling cpufreq_driver->get() with interrupts
      off, but it re-enabling interrupts on some platforms. Why is ->get()
      necessary?
      
      Some systems like to change the CPU frequency behind our
      back, especially during BIOS-intensive operations like suspend or
      resume. If such systems also use a CPU frequency-dependent timing loop,
      delays might be off by large factors. Therefore, we need to ascertain
      as soon as possible that the CPU frequency is indeed at the speed we
      think it is. You can do this two ways: either setting it anew, or trying
      to get it. The latter is what was done, the former also has the same IRQ
      issue.
      
      So, let's try something different: defer the checking to after interrupts
      are re-enabled, by calling cpufreq_update_policy() (via schedule_work()).
      Timings may be off until this later stage, so let's watch out for
      resume regressions caused by the deferred handling of frequency changes
      behind the kernel's back.
      Signed-off-by: Dominik Brodowski <linux@dominikbrodowski.net>
      Signed-off-by: Dave Jones <davej@redhat.com>
      ce6c3997
  10. 05 August, 2009 4 commits
    • D
      [CPUFREQ] Make cpufreq suspend code conditional on powerpc. · 4bc5d341
      Dave Jones committed
      The suspend code runs with interrupts disabled, and the powerpc workaround we
      do in the cpufreq suspend hook calls the drivers ->get method.
      
      powernow-k8's ->get does an smp_call_function_single
      which needs interrupts enabled
      
      cpufreq's suspend/resume code was added in 42d4dc3f to work around
      a hardware problem on ppc powerbooks.  If we make all this code
      conditional on powerpc, we avoid the issue above.
      Signed-off-by: Dave Jones <davej@redhat.com>
      4bc5d341
    • T
      [CPUFREQ] Fix a kobject reference bug related to managed CPUs · d5194dec
      Thomas Renninger committed
      The first offline/online cycle is successful, the second not.
      Doing:
      echo 0 >cpu1/online
      echo 1 >cpu1/online
      echo 0 >cpu1/online
      
      The last command will trigger:
      Jul 22 14:39:50 linux kernel: [  593.210125] ------------[ cut here ]------------
      Jul 22 14:39:50 linux kernel: [  593.210139] WARNING: at lib/kref.c:43 kref_get+0x23/0x2b()
      Jul 22 14:39:50 linux kernel: [  593.210144] Hardware name: To Be Filled By O.E.M.
      Jul 22 14:39:50 linux kernel: [  593.210148] Modules linked in: powernow_k8
      Jul 22 14:39:50 linux kernel: [  593.210158] Pid: 378, comm: kondemand/2 Tainted: G        W  2.6.31-rc2 #38
      Jul 22 14:39:50 linux kernel: [  593.210163] Call Trace:
      Jul 22 14:39:50 linux kernel: [  593.210171]  [<ffffffff812008e8>] ? kref_get+0x23/0x2b
      Jul 22 14:39:50 linux kernel: [  593.210181]  [<ffffffff81041926>] warn_slowpath_common+0x77/0xa4
      Jul 22 14:39:50 linux kernel: [  593.210190]  [<ffffffff81041962>] warn_slowpath_null+0xf/0x11
      Jul 22 14:39:50 linux kernel: [  593.210198]  [<ffffffff812008e8>] kref_get+0x23/0x2b
      Jul 22 14:39:50 linux kernel: [  593.210206]  [<ffffffff811ffa19>] kobject_get+0x1a/0x22
      Jul 22 14:39:50 linux kernel: [  593.210214]  [<ffffffff813e815d>] cpufreq_cpu_get+0x8a/0xcb
      Jul 22 14:39:50 linux kernel: [  593.210222]  [<ffffffff813e87d1>] __cpufreq_driver_getavg+0x1d/0x67
      Jul 22 14:39:50 linux kernel: [  593.210231]  [<ffffffff813ea18f>] do_dbs_timer+0x158/0x27f
      Jul 22 14:39:50 linux kernel: [  593.210240]  [<ffffffff810529ea>] worker_thread+0x200/0x313
      ...
      
      The output continues on every do_dbs_timer ondemand freq checking poll.
      This regression was introduced by git commit:
      3f4a782b
      
      The policy is released when the cpufreq device is removed in:
      __cpufreq_remove_dev():
      	/* if this isn't the CPU which is the parent of the kobj, we
      	 * only need to unlink, put and exit
      	 */
      
      Not creating the symlink is not severe at all.
      As long as:
      sysfs_remove_link(&sys_dev->kobj, "cpufreq");
      handles it gracefully that the symlink did not exist.
      Possibly no error should be returned at all, because ondemand
      governor would still provide the same functionality.
      Userspace in userspace gov case might be confused if the link
      is missing.
      
      Resolves http://bugzilla.kernel.org/show_bug.cgi?id=13903
      
      CC: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      CC: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: Thomas Renninger <trenn@suse.de>
      Signed-off-by: Dave Jones <davej@redhat.com>
      d5194dec
    • P
      [CPUFREQ] Do not set policy for offline cpus · 42c74b84
      Prarit Bhargava committed
      Suspend/Resume fails on multi socket, multi core systems because the cpufreq
      code erroneously sets the per_cpu policy_cpu value when a logical cpu is
      offline.
      
      This most notably results in missing sysfs files that are used to set the
      cpu frequencies of the various cpus.
      Signed-off-by: Prarit Bhargava <prarit@redhat.com>
      Signed-off-by: Dave Jones <davej@redhat.com>
      42c74b84
    • P
      [CPUFREQ] Fix NULL pointer dereference regression in conservative governor · 26d204af
      Pallipadi, Venkatesh committed
      Commit ee88415c
      introduced this regression when it removed enable bit in cpu_dbs_info_s.
      That added a possibility of dbs_cpufreq_notifier getting called for a
      CPU that is not yet managed by conservative governor. That will happen
      as the transition notifier is set as soon as one CPU switches to
      conservative governor and other CPUs can get a NULL pointer dereference
      without the enable bit check. Add the enable bit back again.
      Reported-by: Lermytte Christophe <Christophe.Lermytte@thomson.net>
      Signed-off-by: Venkatesh Pallipadi <venkatesh.pallipadi@intel.com>
      Signed-off-by: Dave Jones <davej@redhat.com>
      26d204af
  11. 09 July, 2009 1 commit
  12. 07 July, 2009 4 commits
  13. 24 June, 2009 1 commit
    • T
      percpu: clean up percpu variable definitions · 245b2e70
      Tejun Heo committed
      Percpu variable definition is about to be updated such that all percpu
      symbols including the static ones must be unique.  Update percpu
      variable definitions accordingly.
      
      * as,cfq: rename ioc_count uniquely
      
      * cpufreq: rename cpu_dbs_info uniquely
      
      * xen: move nesting_count out of xen_evtchn_do_upcall() and rename it
      
      * mm: move ratelimits out of balance_dirty_pages_ratelimited_nr() and
        rename it
      
      * ipv4,6: rename cookie_scratch uniquely
      
      * x86 perf_counter: rename prev_left to pmc_prev_left, irq_entry to
        pmc_irq_entry and nmi_entry to pmc_nmi_entry
      
      * perf_counter: rename disable_count to perf_disable_count
      
      * ftrace: rename test_event_disable to ftrace_test_event_disable
      
      * kmemleak: rename test_pointer to kmemleak_test_pointer
      
      * mce: rename next_interval to mce_next_interval
      
      [ Impact: percpu usage cleanups, no duplicate static percpu var names ]
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Reviewed-by: Christoph Lameter <cl@linux-foundation.org>
      Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
      Cc: Jens Axboe <jens.axboe@oracle.com>
      Cc: Dave Jones <davej@redhat.com>
      Cc: Jeremy Fitzhardinge <jeremy@xensource.com>
      Cc: linux-mm <linux-mm@kvack.org>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Steven Rostedt <srostedt@redhat.com>
      Cc: Li Zefan <lizf@cn.fujitsu.com>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Andi Kleen <andi@firstfloor.org>
      245b2e70
  14. 15 June, 2009 2 commits
  15. 09 June, 2009 1 commit
  16. 27 May, 2009 3 commits
    • M
      [CPUFREQ] fix timer teardown in ondemand governor · b14893a6
      Mathieu Desnoyers committed
      * Rafael J. Wysocki (rjw@sisk.pl) wrote:
      > This message has been generated automatically as a part of a report
      > of regressions introduced between 2.6.28 and 2.6.29.
      >
      > The following bug entry is on the current list of known regressions
      > introduced between 2.6.28 and 2.6.29.  Please verify if it still should
      > be listed and let me know (either way).
      >
      >
      > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13186
      > Subject		: cpufreq timer teardown problem
      > Submitter	: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      > Date		: 2009-04-23 14:00 (24 days old)
      > References	: http://marc.info/?l=linux-kernel&m=124049523515036&w=4
      > Handled-By	: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      > Patch		: http://patchwork.kernel.org/patch/19754/
      > 		  http://patchwork.kernel.org/patch/19753/
      >
      
      (updated changelog)
      
      cpufreq fix timer teardown in ondemand governor
      
      The problem is that dbs_timer_exit() uses cancel_delayed_work() when it should
      use cancel_delayed_work_sync(). cancel_delayed_work() does not wait for the
      workqueue handler to exit.
      
      The ondemand governor does not seem to be affected because the
      "if (!dbs_info->enable)" check at the beginning of the workqueue handler returns
      immediately without rescheduling the work. The conservative governor in
      2.6.30-rc has the same check as the ondemand governor, which makes things
      usually run smoothly. However, if the governor is quickly stopped and then
      started, this could lead to the following race :
      
      dbs_enable could be reenabled and multiple do_dbs_timer handlers would run.
      This is why a synchronized teardown is required.
      
      The following patch applies to, at least, 2.6.28.x, 2.6.29.1, 2.6.30-rc2.
      
      Depends on patch
      cpufreq: remove rwsem lock from CPUFREQ_GOV_STOP call
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      CC: Andrew Morton <akpm@linux-foundation.org>
      CC: gregkh@suse.de
      CC: stable@kernel.org
      CC: cpufreq@vger.kernel.org
      CC: Ingo Molnar <mingo@elte.hu>
      CC: rjw@sisk.pl
      CC: Ben Slusky <sluskyb@paranoiacs.org>
      Signed-off-by: Dave Jones <davej@redhat.com>
      b14893a6
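The difference between a cancel that returns immediately and one that waits for a running handler, which is the core of the fix above, can be illustrated with a userspace analogy. The `DelayedWork` class below is a toy stand-in written for this sketch, not the kernel workqueue API; only the names `cancel_delayed_work()` and `cancel_delayed_work_sync()` in the comments come from the commit message.

```python
import threading


class DelayedWork:
    """Toy stand-in for a delayed work item (not the kernel API)."""

    def __init__(self, handler):
        self._handler = handler
        self._thread = None
        self.cancelled = False

    def run_now(self):
        # Start the handler as if its timer had just fired.
        self._thread = threading.Thread(target=self._handler)
        self._thread.start()

    def cancel(self):
        # Analogous to cancel_delayed_work(): mark cancelled, do not wait.
        self.cancelled = True

    def cancel_sync(self):
        # Analogous to cancel_delayed_work_sync(): also wait for a handler
        # that is already running to finish.
        self.cancelled = True
        if self._thread is not None:
            self._thread.join()


started = threading.Event()
release = threading.Event()
finished = threading.Event()


def handler():
    started.set()
    release.wait()   # simulate a handler that is still mid-execution
    finished.set()


work = DelayedWork(handler)
work.run_now()
started.wait()

work.cancel()
# The handler was not waited for, so it is still running here.
still_running_after_cancel = not finished.is_set()

release.set()
work.cancel_sync()
# cancel_sync() returned only after the handler exited.
done_after_cancel_sync = finished.is_set()

print(still_running_after_cancel, done_after_cancel_sync)
```

In the non-waiting case, teardown code can proceed (and the governor can be restarted) while the old handler is still executing, which is the window the commit describes.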
    • M
      [CPUFREQ] fix timer teardown in conservative governor · b253d2b2
      Mathieu Desnoyers committed
      * Rafael J. Wysocki (rjw@sisk.pl) wrote:
      > This message has been generated automatically as a part of a report
      > of regressions introduced between 2.6.28 and 2.6.29.
      >
      > The following bug entry is on the current list of known regressions
      > introduced between 2.6.28 and 2.6.29.  Please verify if it still should
      > be listed and let me know (either way).
      >
      >
      > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13186
      > Subject		: cpufreq timer teardown problem
      > Submitter	: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      > Date		: 2009-04-23 14:00 (24 days old)
      > References	: http://marc.info/?l=linux-kernel&m=124049523515036&w=4
      > Handled-By	: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      > Patch		: http://patchwork.kernel.org/patch/19754/
      > 		  http://patchwork.kernel.org/patch/19753/
      >
      
      (re-send with updated changelog)
      
      cpufreq fix timer teardown in conservative governor
      
      The problem is that dbs_timer_exit() uses cancel_delayed_work() when it should
      use cancel_delayed_work_sync(). cancel_delayed_work() does not wait for the
      workqueue handler to exit.
      
      The ondemand governor does not seem to be affected because the
      "if (!dbs_info->enable)" check at the beginning of the workqueue handler returns
      immediately without rescheduling the work. The conservative governor in
      2.6.30-rc has the same check as the ondemand governor, which makes things
      usually run smoothly. However, if the governor is quickly stopped and then
      started, this could lead to the following race :
      
      dbs_enable could be reenabled and multiple do_dbs_timer handlers would run.
      This is why a synchronized teardown is required.
      
      Depends on patch
      cpufreq: remove rwsem lock from CPUFREQ_GOV_STOP call
      
      The following patch applies to 2.6.30-rc2. Stable kernels have a similar
      issue which should also be fixed, but the code changed between 2.6.29
      and 2.6.30, so this patch only applies to 2.6.30-rc.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      CC: Andrew Morton <akpm@linux-foundation.org>
      CC: gregkh@suse.de
      CC: stable@kernel.org
      CC: cpufreq@vger.kernel.org
      CC: Ingo Molnar <mingo@elte.hu>
      CC: rjw@sisk.pl
      CC: Ben Slusky <sluskyb@paranoiacs.org>
      Signed-off-by: Dave Jones <davej@redhat.com>
      b253d2b2
    • M
      [CPUFREQ] remove rwsem lock from CPUFREQ_GOV_STOP call · 42a06f21
      Mathieu Desnoyers committed
      * Rafael J. Wysocki (rjw@sisk.pl) wrote:
      > This message has been generated automatically as a part of a report
      > of regressions introduced between 2.6.28 and 2.6.29.
      >
      > The following bug entry is on the current list of known regressions
      > introduced between 2.6.28 and 2.6.29.  Please verify if it still should
      > be listed and let me know (either way).
      >
      >
      > Bug-Entry	: http://bugzilla.kernel.org/show_bug.cgi?id=13186
      > Subject		: cpufreq timer teardown problem
      > Submitter	: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      > Date		: 2009-04-23 14:00 (24 days old)
      > References	: http://marc.info/?l=linux-kernel&m=124049523515036&w=4
      > Handled-By	: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      > Patch		: http://patchwork.kernel.org/patch/19754/
      > 		  http://patchwork.kernel.org/patch/19753/
      
      The patches linked above depend on the following patch to remove
      circular locking dependency :
      
      cpufreq: remove rwsem lock from CPUFREQ_GOV_STOP call
      
      (the following issue was faced when using cancel_delayed_work_sync() in the
      timer teardown (which fixes a race).)
      
      * KOSAKI Motohiro (kosaki.motohiro@jp.fujitsu.com) wrote:
      > Hi
      >
      > my box output following warnings.
      > it seems regression by commit 7ccc7608b836e58fbacf65ee4f8eefa288e86fac.
      >
      > A: work -> do_dbs_timer()  -> cpu_policy_rwsem
      > B: store() -> cpu_policy_rwsem -> cpufreq_governor_dbs() -> work
      >
      >
      
      Hrm, I think it must be due to my attempt to fix the timer teardown race
      in ondemand governor mixed with new locking behavior in 2.6.30-rc.
      
      The rwlock seems to be taken around the whole call to
      cpufreq_governor_dbs(), when it should be only taken around accesses to
      the locked data, and especially *not* around the call to
      dbs_timer_exit().
      
      Reverting my fix attempt would put the teardown race back in place
      (replacing the cancel_delayed_work_sync by cancel_delayed_work).
      Instead, a proper fix would imply modifying this critical section :
      
      cpufreq.c: __cpufreq_remove_dev()
      ...
              if (cpufreq_driver->target)
                      __cpufreq_governor(data, CPUFREQ_GOV_STOP);
      
              unlock_policy_rwsem_write(cpu);
      
      To make sure the __cpufreq_governor() callback is not called with rwsem
      held. This would allow execution of cancel_delayed_work_sync() without
      being nested within the rwsem.
      
      Applies on top of the 2.6.30-rc5 tree.
      
      Required to remove circular dep in teardown of both conservative and
      ondemand governors so they can use cancel_delayed_work_sync().
      CPUFREQ_GOV_STOP does not modify the policy, therefore this locking seemed
      unneeded.
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      CC: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Greg KH <greg@kroah.com>
      CC: Ingo Molnar <mingo@elte.hu>
      CC: "Rafael J. Wysocki" <rjw@sisk.pl>
      CC: Ben Slusky <sluskyb@paranoiacs.org>
      CC: Chris Wright <chrisw@sous-sol.org>
      CC: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Dave Jones <davej@redhat.com>
      42a06f21
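The circular dependency reported in the commit above (path A: work -> cpu_policy_rwsem, path B: cpu_policy_rwsem -> work) can be modelled as a cycle in a directed "acquired or waited on before" graph. The sketch below is only an illustration of that reasoning with a generic depth-first search; the `has_cycle` helper is hypothetical and is not how the kernel's lock validator is implemented.

```python
def has_cycle(edges):
    """Detect a cycle in a directed dependency graph using DFS colouring."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, []).append(b)
        graph.setdefault(b, [])
    WHITE, GREY, BLACK = 0, 1, 2
    state = {node: WHITE for node in graph}

    def visit(node):
        state[node] = GREY
        for nxt in graph[node]:
            if state[nxt] == GREY:
                return True           # back edge: dependency cycle
            if state[nxt] == WHITE and visit(nxt):
                return True
        state[node] = BLACK
        return False

    return any(state[node] == WHITE and visit(node) for node in graph)


# Before the patch: both orderings exist.
before = [
    ("work", "cpu_policy_rwsem"),   # A: work -> do_dbs_timer() -> cpu_policy_rwsem
    ("cpu_policy_rwsem", "work"),   # B: store() -> cpu_policy_rwsem -> ... -> work
]
# After the patch: CPUFREQ_GOV_STOP no longer runs under the rwsem,
# so the teardown path does not wait on the work while holding it.
after = [("work", "cpu_policy_rwsem")]

print(has_cycle(before), has_cycle(after))
```

Dropping the rwsem around the governor stop call removes the second edge, which is why cancel_delayed_work_sync() becomes safe to use in the teardown path.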