1. 27 Dec, 2019 1 commit
    • clocksource/drivers/arc_timer: Utilize generic sched_clock · f3966840
      Alexey Brodkin committed
      commit bf287607c80f24387fedb431a346dc67f25be12c upstream.
      
      It turned out we were using the default implementation of sched_clock()
      from kernel/sched/clock.c, which is only as precise as 1/HZ, i.e.
      by default we had 10 msec granularity of time measurement.
      
      Now given ARC built-in timers are clocked with the same frequency as
      CPU cores we may get much higher precision of time tracking.
      
      Thus we switch to generic sched_clock which really reads ARC hardware
      counters.
      
      This is especially helpful for measuring short events.
      That's what we used to have:
      ------------------------------>8------------------------
      $ perf stat /bin/sh -c /root/lmbench-master/bin/arc/hello > /dev/null
      
       Performance counter stats for '/bin/sh -c /root/lmbench-master/bin/arc/hello':
      
               10.000000      task-clock (msec)         #    2.832 CPUs utilized
                       1      context-switches          #    0.100 K/sec
                       1      cpu-migrations            #    0.100 K/sec
                      63      page-faults               #    0.006 M/sec
                 3049480      cycles                    #    0.305 GHz
                 1091259      instructions              #    0.36  insn per cycle
                  256828      branches                  #   25.683 M/sec
                   27026      branch-misses             #   10.52% of all branches
      
             0.003530687 seconds time elapsed
      
             0.000000000 seconds user
             0.010000000 seconds sys
      ------------------------------>8------------------------
      
      And now we'll see:
      ------------------------------>8------------------------
      $ perf stat /bin/sh -c /root/lmbench-master/bin/arc/hello > /dev/null
      
       Performance counter stats for '/bin/sh -c /root/lmbench-master/bin/arc/hello':
      
                3.004322      task-clock (msec)         #    0.865 CPUs utilized
                       1      context-switches          #    0.333 K/sec
                       1      cpu-migrations            #    0.333 K/sec
                      63      page-faults               #    0.021 M/sec
                 2986734      cycles                    #    0.994 GHz
                 1087466      instructions              #    0.36  insn per cycle
                  255209      branches                  #   84.947 M/sec
                   26002      branch-misses             #   10.19% of all branches
      
             0.003474829 seconds time elapsed
      
             0.003519000 seconds user
             0.000000000 seconds sys
      ------------------------------>8------------------------
      
      Note how much more meaningful the second output is - the time spent on
      execution pretty much matches the number of cycles spent (we're running
      @ 1GHz here).
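
As a rough cross-check of the numbers above, the sketch below converts a cycle count to time at a given counter frequency and computes the granularity of a tick-based clock. The helpers `cycles_to_ns` and `tick_granularity_ns` are hypothetical names for illustration, not kernel APIs, and HZ=100 is an assumption inferred from the 10 msec figure in the message. The kernel's generic sched_clock uses its own conversion, which this does not reproduce.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper: convert a cycle count to nanoseconds for a counter
 * running at freq_hz. Whole seconds and the remainder are converted
 * separately so that cycles * 1e9 cannot overflow for large cycle counts
 * (the remainder term is safe for frequencies up to roughly 18 GHz).
 */
uint64_t cycles_to_ns(uint64_t cycles, uint64_t freq_hz)
{
    assert(freq_hz > 0);
    uint64_t secs = cycles / freq_hz;
    uint64_t rem = cycles % freq_hz;
    return secs * NSEC_PER_SEC + (rem * NSEC_PER_SEC) / freq_hz;
}

/*
 * Hypothetical helper: granularity, in nanoseconds, of a clock that only
 * advances once per scheduler tick (1/HZ).
 */
uint64_t tick_granularity_ns(unsigned int hz)
{
    assert(hz > 0);
    return NSEC_PER_SEC / hz;
}
```

With the 2986734 cycles reported in the second run and the 1 GHz clock stated above, `cycles_to_ns(2986734, 1000000000)` gives 2986734 ns, about 2.99 msec, which is close to the reported 3.004322 msec task-clock. Assuming HZ=100, `tick_granularity_ns(100)` gives 10000000 ns, the 10 msec step visible in the first run.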
      Signed-off-by: Alexey Brodkin <abrodkin@synopsys.com>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Acked-by: Vineet Gupta <vgupta@synopsys.com>
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
  2. 19 May, 2018 1 commit
  3. 28 Feb, 2018 1 commit
  4. 14 Jun, 2017 1 commit
  5. 07 Apr, 2017 1 commit
  6. 25 Dec, 2016 2 commits
  7. 01 Dec, 2016 7 commits
  8. 08 Nov, 2016 1 commit
  9. 15 Jul, 2016 1 commit
  10. 28 Jun, 2016 2 commits
  11. 09 May, 2016 5 commits
    • ARC: clocksource: DT based probe · e608b53e
      Vineet Gupta committed
      - Remove explicit clocksource setup and let it be done by OF framework
        by defining CLOCKSOURCE_OF_DECLARE() for various timers
      
      - This allows multiple clocksources to be potentially registered
        simultaneously: previously we could only do one - as all of them had
        the same arc_counter_setup() routine for registration
      
      - Setup routines also ensure that the underlying timer actually exists.
      
      - Remove some of the panic() calls if underlying timer is NOT detected as
        fallback clocksource might still be available
        1. If GFRC doesn't exist, jiffies clocksource gets registered anyways
        2. if RTC doesn't exist, TIMER1 can take over (as it is always
           present)
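
The fallback behaviour described in the list above can be sketched as follows. This is a minimal illustration, not the driver's code: `struct mock_clocksource`, `mock_probe`, `mock_select` and the numeric ratings are all hypothetical. It only shows the idea that a setup routine reports a missing timer with an error code instead of calling panic(), so that a lower-priority clocksource that is always available can take over.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a clocksource candidate. */
struct mock_clocksource {
    const char *name;
    int rating;   /* higher is preferred (assumed ordering) */
    int present;  /* non-zero if the underlying timer exists */
};

/*
 * Hypothetical setup routine: returns 0 on success, or a negative value
 * if the underlying timer is not detected. It does not panic.
 */
int mock_probe(const struct mock_clocksource *cs)
{
    assert(cs != NULL);
    return cs->present ? 0 : -1;
}

/*
 * Try every candidate and keep the best one that probes successfully.
 * The fallback (for example a software, tick-based clocksource) is
 * assumed to always exist, so the result is never NULL.
 */
const struct mock_clocksource *
mock_select(const struct mock_clocksource *cands, size_t n,
            const struct mock_clocksource *fallback)
{
    assert(fallback != NULL);
    const struct mock_clocksource *best = fallback;

    for (size_t i = 0; i < n; i++) {
        if (mock_probe(&cands[i]) == 0 && cands[i].rating > best->rating)
            best = &cands[i];
    }
    return best;
}
```

If the preferred counter is absent but a second hardware timer is present, `mock_select` returns the second timer. If no hardware candidate probes successfully, it returns the fallback.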
      
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • ARC: clockevent: DT based probe · 77c8d0d6
      Vineet Gupta committed
       - timer frequency is derived from DT (no longer rely on top level
         DT "clock-frequency" probed early and exported by asm/clk.h)
      
       - TIMER0_IRQ need not be exported across arch code, confined to intc as
         it is property of same
      
       - Any failure in clockevent setup is treated as fatal and the system
         panic()s, as there is no generic fallback (unlike clocksource, where
         a jiffies based soft clocksource always exists)
      Acked-by: Daniel Lezcano <daniel.lezcano@linaro.org>
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • ARC: clockevent: Prepare for DT based probe · 69fbd098
      Noam Camus committed
       - call clocksource_probe()
       - This in turn needs of_clk_init() to be called earlier
      
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Signed-off-by: Noam Camus <noamc@ezchip.com>
      [vgupta: broken off from a bigger patch]
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • ARC: clockevent: switch to cpu notifier for clockevent setup · eec3c58e
      Noam Camus committed
      ARC Timers so far have been handled as "legacy" w/o explicit description
      in DT. This poses a challenge for newer platforms wanting to use them.
      This series will eventually help move timers over to DT.
      
      This patch makes a small change: it uses a CPU notifier to set up the
      clockevent on non-boot CPUs, so explicit setup is done only on the boot
      CPU (which will later be done by DT)
      Signed-off-by: Noam Camus <noamc@ezchip.com>
      [vgupta: broken off from a bigger patch]
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • ARC: opencode arc_request_percpu_irq · 56957940
      Vineet Gupta committed
      - The idea is to remove the API usage since it has a subtle
        design flaw - it relies on being called on cpu0 first. This is true for
        some early per cpu irqs such as TIMER/IPI, but not for late probed
        per cpu peripherals such as perf. And its usage in perf has already
        bitten us once: see c6317bc7
        ("ARCv2: perf: Ensure perf intr gets enabled on all cores") where we
        ended up open coding it anyways
      
      - The seeming duplication will go away once we start using cpu notifier
        for timer setup
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
  12. 11 Mar, 2016 1 commit
  13. 29 Jan, 2016 1 commit
  14. 28 Oct, 2015 1 commit
  15. 20 Jul, 2015 1 commit
  16. 22 Jun, 2015 2 commits
  17. 19 Jun, 2015 2 commits
  18. 23 Jul, 2014 2 commits
  19. 03 Jun, 2014 1 commit
  20. 26 Mar, 2014 2 commits
  21. 07 Nov, 2013 1 commit
  22. 06 Nov, 2013 2 commits
    • ARC: use __weak instead of __attribute__((weak)) · 064a6269
      Vineet Gupta committed
      Signed-off-by: Vineet Gupta <vgupta@synopsys.com>
    • arc: Replace __get_cpu_var uses · 6855e95c
      Christoph Lameter committed
      __get_cpu_var() is used for multiple purposes in the kernel source. One of them is
      address calculation via the form &__get_cpu_var(x). This calculates the address for
      the instance of the percpu variable of the current processor based on an offset.
      
      Other use cases are for storing and retrieving data from the current processor's percpu area.
      __get_cpu_var() can be used as an lvalue when writing data or on the right side of an assignment.
      
      __get_cpu_var() is defined as :
      
      #define __get_cpu_var(var) (*this_cpu_ptr(&(var)))
      
      __get_cpu_var() always only does an address determination. However, store and retrieve operations
      could use a segment prefix (or global register on other platforms) to avoid the address calculation.
      
      this_cpu_write() and this_cpu_read() can directly take an offset into a percpu area and use
      optimized assembly code to read and write per cpu variables.
      
      This patch converts __get_cpu_var into either an explicit address calculation using this_cpu_ptr()
      or into a use of this_cpu operations that use the offset. Thereby address calculations are avoided
      and fewer registers are used when code is generated.
      
      At the end of the patchset all uses of __get_cpu_var have been removed so the macro is removed too.
      
      The patchset includes passes over all arches as well. Once these operations are used throughout, then
      specialized macros can be defined in non-x86 arches as well in order to optimize per cpu access by
      e.g. using a global register that may be set to the per cpu base.
      
      Transformations done to __get_cpu_var()
      
      1. Determine the address of the percpu instance of the current processor.
      
      	DEFINE_PER_CPU(int, y);
      	int *x = &__get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(&y);
      
      2. Same as #1 but this time an array structure is involved.
      
      	DEFINE_PER_CPU(int, y[20]);
      	int *x = __get_cpu_var(y);
      
          Converts to
      
      	int *x = this_cpu_ptr(y);
      
      3. Retrieve the content of the current processor's instance of a per cpu variable.
      
      	DEFINE_PER_CPU(int, y);
      	int x = __get_cpu_var(y);
      
         Converts to
      
      	int x = __this_cpu_read(y);
      
      4. Retrieve the content of a percpu struct
      
      	DEFINE_PER_CPU(struct mystruct, y);
      	struct mystruct x = __get_cpu_var(y);
      
         Converts to
      
      	memcpy(&x, this_cpu_ptr(&y), sizeof(x));
      
      5. Assignment to a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y) = x;
      
         Converts to
      
      	this_cpu_write(y, x);
      
      6. Increment/Decrement etc of a per cpu variable
      
      	DEFINE_PER_CPU(int, y);
      	__get_cpu_var(y)++;
      
         Converts to
      
      	this_cpu_inc(y);
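
The semantics of the converted forms can be tried out in user space with the mock below. Everything in it is an assumption made for illustration: the `mock_` macros, `MOCK_NR_CPUS` and `mock_current_cpu` are hypothetical stand-ins that model per cpu data as a plain array indexed by a "current CPU" number. Unlike the kernel's this_cpu_ptr(), the mock accessor takes the variable name rather than its address, and none of the kernel's preemption or atomicity guarantees are modelled.

```c
#include <assert.h>
#include <string.h>

#define MOCK_NR_CPUS 4

/* Hypothetical stand-in for "the CPU this code is running on". */
static int mock_current_cpu;

/* One slot per CPU; a user-space model, not the kernel's percpu layout. */
#define MOCK_DEFINE_PER_CPU(type, name) type name[MOCK_NR_CPUS]
#define mock_this_cpu_ptr(name)         (&(name)[mock_current_cpu])
#define mock_this_cpu_read(name)        ((name)[mock_current_cpu])
#define mock_this_cpu_write(name, val)  ((name)[mock_current_cpu] = (val))
#define mock_this_cpu_inc(name)         ((name)[mock_current_cpu]++)

struct mystruct {
    int a;
    int b;
};

MOCK_DEFINE_PER_CPU(int, counter);
MOCK_DEFINE_PER_CPU(struct mystruct, state);

void mock_set_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < MOCK_NR_CPUS);
    mock_current_cpu = cpu;
}

/* Walks through cases 1, 3, 4, 5 and 6; returns the final counter value. */
int demo_transformations(int cpu)
{
    struct mystruct copy;

    mock_set_cpu(cpu);

    int *p = mock_this_cpu_ptr(counter);   /* case 1: address of this CPU's instance */
    mock_this_cpu_write(counter, 41);      /* case 5: assignment */
    mock_this_cpu_inc(counter);            /* case 6: increment */
    int v = mock_this_cpu_read(counter);   /* case 3: read the value */

    mock_this_cpu_ptr(state)->a = v;
    memcpy(&copy, mock_this_cpu_ptr(state), sizeof(copy)); /* case 4: copy a struct */

    return (*p == v && copy.a == v) ? v : -1;
}
```

Calling `demo_transformations(1)` touches only slot 1 of `counter` and `state`; the slots belonging to the other mock CPUs stay untouched, which is the property the per cpu accessors exist to provide.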
      Acked-by: Vineet Gupta <vgupta@synopsys.com>
      Signed-off-by: Christoph Lameter <cl@linux.com>
  23. 27 Sep, 2013 1 commit