1. 28 Jun, 2016 1 commit
    • R
      intel_pstate: Do not clear utilization update hooks on policy changes · 5ab666e0
      Rafael J. Wysocki committed
      intel_pstate_set_policy() is invoked by the cpufreq core during
      driver initialization, on changes of policy attributes (minimum and
      maximum frequency, for example) via sysfs and via CPU notifications
      from the platform firmware.  On some platforms the latter may occur
      relatively often.
      
      Commit bb6ab52f (intel_pstate: Do not set utilization update hook
      too early) made intel_pstate_set_policy() clear the CPU's utilization
      update hook before updating the policy attributes for it (and set the
      hook again after doing that), but that involves invoking
      synchronize_sched() and adds overhead to the CPU notifications
      mentioned above and to the sched-RCU handling in general.
      
      That extra overhead is arguably not necessary, because updating
      policy attributes when the CPU's utilization update hook is active
      should not lead to any adverse effects, so drop the clearing of
      the hook from intel_pstate_set_policy() and make it check if
      the hook has been set already when attempting to set it.
      
      Fixes: bb6ab52f (intel_pstate: Do not set utilization update hook too early)
      Reported-by: Jisheng Zhang <jszhang@marvell.com>
      Tested-by: Jisheng Zhang <jszhang@marvell.com>
      Tested-by: Doug Smythies <dsmythies@telus.net>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      5ab666e0
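The "check if the hook has been set already" idea can be sketched in plain C as follows. This is a minimal user-space illustration, not the driver's code: `struct cpu_state`, `update_util_set`, `set_update_util_hook()`, `set_policy()` and `register_calls` are hypothetical names chosen for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU driver state. */
struct cpu_state {
    bool update_util_set;               /* assumed flag: hook already set? */
    void (*hook)(struct cpu_state *);
    int min_khz, max_khz;
};

static int register_calls;  /* counts real registrations, for illustration */

static void sample_hook(struct cpu_state *cpu) { (void)cpu; }

/* Set the hook only if it has not been set yet. */
static void set_update_util_hook(struct cpu_state *cpu)
{
    if (cpu->update_util_set)
        return;
    cpu->hook = sample_hook;
    cpu->update_util_set = true;
    register_calls++;
}

/* Policy update: change the limits without clearing the hook first. */
static void set_policy(struct cpu_state *cpu, int min_khz, int max_khz)
{
    cpu->min_khz = min_khz;
    cpu->max_khz = max_khz;
    set_update_util_hook(cpu);
}
```

With this shape, repeated policy updates only touch the limits; the expensive set/clear cycle (and any synchronization it would need) happens at most once per CPU.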
  2. 24 Jun, 2016 1 commit
  3. 15 Jun, 2016 1 commit
  4. 08 Jun, 2016 2 commits
    • S
      cpufreq: intel_pstate: Fix ->set_policy() interface for no_turbo · 983e600e
      Srinivas Pandruvada committed
      When turbo is disabled, the ->set_policy() interface is broken.
      
      For example, when turbo is disabled and cpuinfo.max = 2900000 (full
      max turbo frequency), setting the limits results in frequency less
      than the requested one:
      Set 1000000 KHz results in 0700000 KHz
      Set 1500000 KHz results in 1100000 KHz
      Set 2000000 KHz results in  1500000 KHz
      
      This is because the limits->max_perf fraction is calculated using
      the max turbo frequency as the reference, but when the max P-State is
      capped in intel_pstate_get_min_max(), the reference is not the max
      turbo P-State. This results in reducing max P-State.
      
      One option is to always use max turbo as reference for calculating
      limits. But this will not be correct. By definition, the intel_pstate
      sysfs limits show the percentage of available performance. So when
      the BIOS has disabled turbo, the available performance is max non-turbo,
      and max_perf_pct should still show 100%.
      Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      [ rjw : Subject & changelog, rewrite in fewer lines of code ]
      Cc: All applicable <stable@vger.kernel.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      983e600e
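The effect of picking the wrong reference frequency can be illustrated with a little arithmetic. The sketch below is a simplification under stated assumptions: the max non-turbo frequency of 2100000 kHz used in the usage note is hypothetical (the commit message does not give it), the helper names are invented, and P-state granularity is ignored, so the numbers will not match the ones quoted above exactly.

```c
/* Percentage of the reference frequency that a requested limit
 * represents, rounded up. */
static int max_perf_pct(unsigned int requested_khz, unsigned int reference_khz)
{
    return (int)((100ULL * requested_khz + reference_khz - 1) / reference_khz);
}

/* Frequency obtained by applying a percentage to the capped maximum. */
static unsigned int capped_khz(int pct, unsigned int cap_khz)
{
    return (unsigned int)((unsigned long long)cap_khz * pct / 100);
}
```

For a 1000000 kHz request, `max_perf_pct(1000000, 2900000)` is 35, and applying 35% to an assumed non-turbo cap of 2100000 kHz gives 735000 kHz, which is below the request. Using the cap itself as the reference gives 48%, or 1008000 kHz, and a request equal to the cap maps to 100%.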
    • S
      cpufreq: intel_pstate: Fix code ordering in intel_pstate_set_policy() · 2c2c1af4
      Srinivas Pandruvada committed
      The limits->max_perf is rounded_up but immediately overwritten by
      another assignment to limits->max_perf.
      
      Move that operation to the correct location.
      
      While at it, also add a pr_debug() call in ->set_policy to aid in
      debugging.
      
      Fixes: 785ee278 (cpufreq: intel_pstate: Fix limits->max_perf rounding error)
      Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
      [ rjw : Subject & changelog ]
      Cc: 4.4+ <stable@vger.kernel.org> # 4.4+
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      2c2c1af4
  5. 02 Jun, 2016 1 commit
  6. 30 May, 2016 1 commit
  7. 28 May, 2016 1 commit
    • A
      remove lots of IS_ERR_VALUE abuses · 287980e4
      Arnd Bergmann committed
      Most users of IS_ERR_VALUE() in the kernel are wrong, as they
      pass an 'int' into a function that takes an 'unsigned long'
      argument. This happens to work because the type is sign-extended
      on 64-bit architectures before it gets converted into an
      unsigned type.
      
      However, anything that passes an 'unsigned short' or 'unsigned int'
      argument into IS_ERR_VALUE() is guaranteed to be broken, as are
      8-bit integers and types that are wider than 'unsigned long'.
      
      Andrzej Hajda has already fixed a lot of the worst abusers that
      were causing actual bugs, but it would be nice to prevent any
      users that are not passing 'unsigned long' arguments.
      
      This patch changes all users of IS_ERR_VALUE() that I could find
      on 32-bit ARM randconfig builds and x86 allmodconfig. For the
      moment, this doesn't change the definition of IS_ERR_VALUE()
      because there are probably still architecture specific users
      elsewhere.
      
      Almost all the warnings I got are for files that are better off
      using 'if (err)' or 'if (err < 0)'.
      The only legitimate user I could find that we get a warning for
      is the (32-bit only) freescale fman driver, so I did not remove
      the IS_ERR_VALUE() there but changed the type to 'unsigned long'.
      For 9pfs, I just worked around one user whose calling conventions
      are so obscure that I did not dare change the behavior.
      
      I was using this definition for testing:
      
       #define IS_ERR_VALUE(x) ((unsigned long*)NULL == (typeof (x)*)NULL && \
             unlikely((unsigned long long)(x) >= (unsigned long long)(typeof(x))-MAX_ERRNO))
      
      which ends up making all 16-bit or wider types work correctly with
      the most plausible interpretation of what IS_ERR_VALUE() was supposed
      to return according to its users, but also causes a compile-time
      warning for any users that do not pass an 'unsigned long' argument.
      
      I suggested this approach earlier this year, but back then we ended
      up deciding to just fix the users that are obviously broken. After
      the initial warning that caused me to get involved in the discussion
      (fs/gfs2/dir.c) showed up again in the mainline kernel, Linus
      asked me to send the whole thing again.
      
      [ Updated the 9p parts as per Al Viro  - Linus ]
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Andrzej Hajda <a.hajda@samsung.com>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Link: https://lkml.org/lkml/2016/1/7/363
      Link: https://lkml.org/lkml/2016/5/27/486
      Acked-by: Srinivas Kandagatla <srinivas.kandagatla@linaro.org> # For nvmem part
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      287980e4
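The difference between passing an 'int' and an 'unsigned int' can be reproduced in user space. The sketch below copies the classic shape of the macro without `unlikely()`; the wrapper functions are hypothetical helpers added for the example, and the `unsigned int` result described applies to targets where `unsigned long` is wider than `unsigned int` (for example 64-bit Linux).

```c
#include <errno.h>

#define MAX_ERRNO 4095

/* User-space copy of the classic definition, minus unlikely(). */
#define IS_ERR_VALUE(x) ((unsigned long)(x) >= (unsigned long)-MAX_ERRNO)

/* An int is sign-extended before the conversion, so this "happens to work". */
static int int_is_err(int v)
{
    return IS_ERR_VALUE(v);
}

/* An unsigned int is zero-extended, so on targets with a 64-bit
 * unsigned long a stored -EINVAL is no longer seen as an error. */
static int uint_is_err(unsigned int v)
{
    return IS_ERR_VALUE(v);
}

/* The plain check that most callers are better off using. */
static int plain_check(int err)
{
    return err < 0;
}
```

Here `int_is_err(-EINVAL)` and `plain_check(-EINVAL)` both report an error, while `uint_is_err((unsigned int)-EINVAL)` does not when `unsigned long` is 64 bits wide.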
  8. 18 May, 2016 4 commits
  9. 17 May, 2016 1 commit
  10. 13 May, 2016 5 commits
  11. 12 May, 2016 5 commits
  12. 11 May, 2016 2 commits
    • A
      cpufreq: powernv: del_timer_sync when global and local pstate are equal · 0bc10b93
      Akshay Adiga committed
      When global and local pstate are equal in a powernv_target_index() call,
      we don't queue a timer. But a timer may already be queued to fire in the
      future, which could cause it to fire one additional time for no use.
      Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      0bc10b93
    • A
      cpufreq: powernv: Move smp_call_function_any() out of irq safe block · 1fd3ff28
      Akshay Adiga committed
      Fix a WARN_ON caused by smp_call_function_any() when irq is disabled,
      because of changes made in the patch ('cpufreq: powernv: Ramp-down
       global pstate slower than local-pstate')
      https://patchwork.ozlabs.org/patch/612058/
      
       WARNING: CPU: 0 PID: 4 at kernel/smp.c:291
      smp_call_function_single+0x170/0x180
      
       Call Trace:
       [c0000007f648f9f0] [c0000007f648fa90] 0xc0000007f648fa90 (unreliable)
       [c0000007f648fa30] [c0000000001430e0] smp_call_function_any+0x170/0x1c0
       [c0000007f648fa90] [c0000000007b4b00]
      powernv_cpufreq_target_index+0xe0/0x250
       [c0000007f648fb00] [c0000000007ac9dc]
      __cpufreq_driver_target+0x20c/0x3d0
       [c0000007f648fbc0] [c0000000007b1b4c] od_dbs_timer+0xcc/0x260
       [c0000007f648fc10] [c0000000007b3024] dbs_work_handler+0x54/0xa0
       [c0000007f648fc50] [c0000000000c49a8] process_one_work+0x1d8/0x590
       [c0000007f648fce0] [c0000000000c4e08] worker_thread+0xa8/0x660
       [c0000007f648fd80] [c0000000000cca88] kthread+0x108/0x130
       [c0000007f648fe30] [c0000000000095e8] ret_from_kernel_thread+0x5c/0x74
      
      - Calling smp_call_function_any() with interrupt disabled (through
       spin_lock_irqsave) could cause a deadlock, as smp_call_function_any()
       relies on the IPI to complete. This is detected in the
       smp_call_function_any() call and hence the WARN_ON.
      
      - The spinlock (gpstates->lock) is only used to synchronize access to
       global_pstate_info between the timer irq handler and target_index calls,
       and the timer irq handler just try_locks(), so it would not cause a
       deadlock. Hence the spinlock does not need to be irq safe.
      
      - As smp_call_function_any() is a blocking call and does not access
       global_pstates_info, the critical section can be reduced by moving
       smp_call_function_any() after giving up the lock.
      Reported-by: Abdul Haleem <abdhalee@linux.vnet.linux.com>
      Signed-off-by: Akshay Adiga <akshay.adiga@linux.vnet.ibm.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      1fd3ff28
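The restructuring described in the last point (update the shared state under the lock, then make the blocking call after releasing it) can be sketched with a toy single-threaded model. Nothing here is kernel code: the lock is a plain flag and all names (`toy_lock()`, `blocking_notify()`, `target_index()`, `shared_pstate`) are hypothetical stand-ins.

```c
#include <stdbool.h>

static bool lock_held;          /* toy lock, stands in for a spinlock */
static int  shared_pstate;      /* state the lock protects (hypothetical) */
static bool called_under_lock;  /* records a violation, if any */
static int  last_sent;          /* last value passed to the blocking call */

static void toy_lock(void)   { lock_held = true; }
static void toy_unlock(void) { lock_held = false; }

/* Stands in for a blocking cross-CPU call; it must not run while
 * the lock is held. */
static void blocking_notify(int pstate)
{
    if (lock_held)
        called_under_lock = true;
    last_sent = pstate;
}

static void target_index(int new_pstate)
{
    int snapshot;

    toy_lock();
    shared_pstate = new_pstate;  /* update the protected state */
    snapshot = shared_pstate;    /* copy what the blocking call needs */
    toy_unlock();

    blocking_notify(snapshot);   /* blocking work happens after unlock */
}
```

Taking a local snapshot inside the critical section is what lets the blocking call run without touching the protected state.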
  13. 10 May, 2016 1 commit
  14. 07 May, 2016 1 commit
  15. 06 May, 2016 1 commit
    • R
      cpufreq: governor: Fix handling of special cases in dbs_update() · 9485e4ca
      Rafael J. Wysocki committed
      As reported in KBZ 69821:
      
      "With CONFIG_HZ_PERIODIC=y cpu stays at the lowest frequency 800MHz
       even if usage goes to 100%, frequency does not scale up, the governor
       in use is ondemand. Neither works conservative. Performance and
       userspace governors work as expected.
      
       With CONFIG_NO_HZ_IDLE or CONFIG_NO_HZ_FULL cpu scales up with ondemand
       as expected."
      
      Analysis carried out by Chen Yu leads to the conclusion that the
      observed issue is due to idle_time in dbs_update() representing a
      negative number in which case the function will return 0 as the load
      (unless load is greater than 0 for another CPU sharing the policy),
      although that need not be the right choice.
      
      Indeed, idle_time representing a negative number means that during
      the last sampling interval the CPU was almost 100% busy on the rough
      average, so 100 should be returned as the load in that case.
      
      Modify the code accordingly and rearrange it to clarify the handling
      of all of the special cases in it.  While at it, also avoid returning
      zero as the load if time_elapsed is 0 (it doesn't really make sense
      to return 0 then).
      
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=69821
      Tested-by: Chen Yu <yu.c.chen@intel.com>
      Tested-by: Timo Valtoaho <timo.valtoaho@gmail.com>
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
      9485e4ca
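The special cases discussed in this changelog can be sketched as a small stand-alone function. This is a simplified illustration, not the governor's code: `compute_load()` and its parameters are hypothetical, and reusing the previous load when `time_elapsed` is 0 is an assumption, since the changelog only says that returning 0 does not make sense in that case. The cast of an out-of-range unsigned value to `int` relies on the usual wrap-around behavior of common compilers.

```c
/*
 * Simplified load computation for one sampling interval.
 * idle_time is unsigned, so a "negative" idle time shows up as a
 * very large value.
 */
static unsigned int compute_load(unsigned int time_elapsed,
                                 unsigned int idle_time,
                                 unsigned int prev_load)
{
    if (time_elapsed == 0)
        return prev_load;  /* assumption: reuse the previous load */

    if (time_elapsed >= idle_time)
        return (unsigned int)(100ULL * (time_elapsed - idle_time) /
                              time_elapsed);

    /* idle_time exceeds time_elapsed: if it is negative when read as
     * a signed number, the CPU was busy almost all of the time. */
    return (int)idle_time < 0 ? 100 : 0;
}
```

For example, an interval of 10000 with 2500 idle yields a load of 75, while an idle time of -50 stored in the unsigned variable yields 100 rather than 0.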
  16. 05 May, 2016 4 commits
  17. 04 May, 2016 1 commit
    • R
      intel_pstate: Fix intel_pstate_get() · 6d45b719
      Rafael J. Wysocki committed
      After commit 8fa520af "intel_pstate: Remove freq calculation from
      intel_pstate_calc_busy()" intel_pstate_get() calls get_avg_frequency()
      to compute the average frequency, which is problematic for two reasons.
      
      First, intel_pstate_get() may be invoked before the driver reads the
      CPU feedback registers for the first time and if that happens,
      get_avg_frequency() will attempt to divide by zero.
      
      Second, the get_avg_frequency() call in intel_pstate_get() is racy
      with respect to intel_pstate_sample() and it may end up returning
      completely meaningless values for this reason.
      
      Moreover, after commit 7349ec04 "intel_pstate: Move
      intel_pstate_calc_busy() into get_target_pstate_use_performance()"
      sample.core_pct_busy is never computed on Atom, but it is used in
      intel_pstate_adjust_busy_pstate() in that case too.
      
      To address those problems notice that if sample.core_pct_busy
      was used in the average frequency computation carried out by
      get_avg_frequency(), both the divide by zero problem and the
      race with respect to intel_pstate_sample() would be avoided.
      
      Accordingly, move the invocation of intel_pstate_calc_busy() from
      get_target_pstate_use_performance() to intel_pstate_update_util(),
      which also will take care of the uninitialized sample.core_pct_busy
      on Atom, and modify get_avg_frequency() to use sample.core_pct_busy
      as per the above.
      Reported-by: kernel test robot <ying.huang@linux.intel.com>
      Link: http://marc.info/?l=linux-kernel&m=146226437623173&w=4
      Fixes: 8fa520af "intel_pstate: Remove freq calculation from intel_pstate_calc_busy()"
      Fixes: 7349ec04 "intel_pstate: Move intel_pstate_calc_busy() into get_target_pstate_use_performance()"
      Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
      6d45b719
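The idea of deriving the average frequency from an already computed busy percentage, instead of dividing by a feedback counter at read time, can be sketched as follows. This is a rough user-space model with plain integers rather than the driver's fixed-point arithmetic; `struct sample`, `calc_busy()` and `avg_freq_khz()` are hypothetical, and the zero check in `calc_busy()` is an assumption made so that the sketch is safe on its own.

```c
/* Hypothetical model of one feedback sample. */
struct sample {
    unsigned long long aperf;    /* delta of the actual-performance counter */
    unsigned long long mperf;    /* delta of the reference counter */
    unsigned int core_pct_busy;  /* 100 * aperf / mperf, 0 before first sample */
};

/* Compute the busy percentage once, when a sample is taken. */
static void calc_busy(struct sample *s)
{
    s->core_pct_busy = s->mperf ?
        (unsigned int)(100 * s->aperf / s->mperf) : 0;
}

/* Average frequency from the stored percentage: there is no division
 * by mperf here, so reading before the first sample cannot divide by 0. */
static unsigned int avg_freq_khz(const struct sample *s, unsigned int max_khz)
{
    return (unsigned int)((unsigned long long)max_khz *
                          s->core_pct_busy / 100);
}
```

A reader that runs before any sample has been taken sees a percentage of 0 and gets 0 kHz back, and it only ever reads the single value written by the sampling path.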
  18. 02 May, 2016 1 commit
  19. 01 May, 2016 1 commit
  20. 28 Apr, 2016 5 commits