1. 18 Oct 2021 (10 commits)
  2. 16 Jun 2021 (1 commit)
  3. 04 Jun 2021 (1 commit)
  4. 08 Apr 2021 (2 commits)
  5. 07 Apr 2021 (2 commits)
    • ptp: arm/arm64: Enable ptp_kvm for arm/arm64 · 300bb1fe
      Committed by Jianyong Wu
      Currently, there is no mechanism to keep the time in sync between guest and
      host in an arm/arm64 virtualization environment. The guest's time will drift
      relative to the host's after boot, as each may use a different third-party
      time source to correct its clock. The resulting deviation is on the order of
      milliseconds, but some scenarios, such as cloud environments, require higher
      time precision.
      
      The kvm ptp clock, which uses the host clock source as a reference clock to
      synchronize time between guest and host, has already been adopted on x86,
      where it improves the time sync accuracy from milliseconds to nanoseconds.
      
      This patch enables the kvm ptp clock for arm/arm64, improving clock sync
      precision significantly.
      
      Test results comparing arm/arm64 with and without the kvm ptp clock are shown
      below. They are taken from the output of the command 'chronyc sources'; the
      "Last sample" column is the one to watch, as it shows the offset between the
      local clock and the source at the last measurement.
      
      no kvm ptp in guest:
      MS Name/IP address   Stratum Poll Reach LastRx Last sample
      ========================================================================
      ^* dns1.synet.edu.cn      2   6   377    13  +1040us[+1581us] +/-   21ms
      ^* dns1.synet.edu.cn      2   6   377    21  +1040us[+1581us] +/-   21ms
      ^* dns1.synet.edu.cn      2   6   377    29  +1040us[+1581us] +/-   21ms
      ^* dns1.synet.edu.cn      2   6   377    37  +1040us[+1581us] +/-   21ms
      ^* dns1.synet.edu.cn      2   6   377    45  +1040us[+1581us] +/-   21ms
      ^* dns1.synet.edu.cn      2   6   377    53  +1040us[+1581us] +/-   21ms
      ^* dns1.synet.edu.cn      2   6   377    61  +1040us[+1581us] +/-   21ms
      ^* dns1.synet.edu.cn      2   6   377     4   -130us[ +796us] +/-   21ms
      ^* dns1.synet.edu.cn      2   6   377    12   -130us[ +796us] +/-   21ms
      ^* dns1.synet.edu.cn      2   6   377    20   -130us[ +796us] +/-   21ms
      
      in host:
      MS Name/IP address   Stratum Poll Reach LastRx Last sample
      ========================================================================
      ^* 120.25.115.20          2   7   377    72   -470us[ -603us] +/-   18ms
      ^* 120.25.115.20          2   7   377    92   -470us[ -603us] +/-   18ms
      ^* 120.25.115.20          2   7   377   112   -470us[ -603us] +/-   18ms
      ^* 120.25.115.20          2   7   377     2   +872ns[-6808ns] +/-   17ms
      ^* 120.25.115.20          2   7   377    22   +872ns[-6808ns] +/-   17ms
      ^* 120.25.115.20          2   7   377    43   +872ns[-6808ns] +/-   17ms
      ^* 120.25.115.20          2   7   377    63   +872ns[-6808ns] +/-   17ms
      ^* 120.25.115.20          2   7   377    83   +872ns[-6808ns] +/-   17ms
      ^* 120.25.115.20          2   7   377   103   +872ns[-6808ns] +/-   17ms
      ^* 120.25.115.20          2   7   377   123   +872ns[-6808ns] +/-   17ms
      
      dns1.synet.edu.cn is the network reference clock for the guest and
      120.25.115.20 is the network reference clock for the host. The clock error
      between guest and host cannot be measured directly, but a rough estimate
      puts it on the order of hundreds of microseconds to milliseconds.
      
      with kvm ptp in guest:
      chrony has been disabled in the host to remove any disturbance from the
      network clock.
      
      MS Name/IP address         Stratum Poll Reach LastRx Last sample
      ========================================================================
      * PHC0                    0   3   377     8     -7ns[   +1ns] +/-    3ns
      * PHC0                    0   3   377     8     +1ns[  +16ns] +/-    3ns
      * PHC0                    0   3   377     6     -4ns[   -0ns] +/-    6ns
      * PHC0                    0   3   377     6     -8ns[  -12ns] +/-    5ns
      * PHC0                    0   3   377     5     +2ns[   +4ns] +/-    4ns
      * PHC0                    0   3   377    13     +2ns[   +4ns] +/-    4ns
      * PHC0                    0   3   377    12     -4ns[   -6ns] +/-    4ns
      * PHC0                    0   3   377    11     -8ns[  -11ns] +/-    6ns
      * PHC0                    0   3   377    10    -14ns[  -20ns] +/-    4ns
      * PHC0                    0   3   377     8     +4ns[   +5ns] +/-    4ns
      
      PHC0 is the ptp clock that uses the host clock as its source clock, so the
      clock difference between host and guest is on the order of nanoseconds.
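
      For reference, a minimal userspace sketch (not part of this patch) of how a
      program such as chrony can read a PTP hardware clock like PHC0 through the
      dynamic POSIX clock interface. The device path /dev/ptp0 and the locally
      defined FD_TO_CLOCKID macro are assumptions for illustration.

      /*
       * Hedged sketch: read a PTP hardware clock (such as the kvm ptp clock,
       * shown as PHC0 above) via the dynamic POSIX clock interface.
       * /dev/ptp0 and FD_TO_CLOCKID are assumptions for illustration.
       */
      #include <fcntl.h>
      #include <stdio.h>
      #include <time.h>
      #include <unistd.h>

      /* Encode an open chardev fd as a dynamic clock id (CLOCKFD convention). */
      #define CLOCKFD 3
      #define FD_TO_CLOCKID(fd) ((clockid_t)((((unsigned int)~(fd)) << 3) | CLOCKFD))

      int main(void)
      {
              int fd = open("/dev/ptp0", O_RDONLY);   /* assumed device node */
              struct timespec ts;

              if (fd < 0) {
                      perror("open /dev/ptp0");
                      return 1;
              }
              if (clock_gettime(FD_TO_CLOCKID(fd), &ts) == 0)
                      printf("PHC time: %lld.%09ld\n",
                             (long long)ts.tv_sec, ts.tv_nsec);
              else
                      perror("clock_gettime");
              close(fd);
              return 0;
      }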
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Acked-by: Richard Cochran <richardcochran@gmail.com>
      Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20201209060932.212364-8-jianyong.wu@arm.com
    • clocksource: Add clocksource id for arm arch counter · 100148d0
      Committed by Jianyong Wu
      Add a clocksource id to the ARM generic counter so that it can be easily
      identified by callers such as ptp_kvm.
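
      A standalone sketch of the idea (the identifiers below are illustrative, not
      a quote of the patch): the clocksource carries an id that callers like
      ptp_kvm can compare against, instead of matching on a name string.

      /*
       * Sketch only: give the clocksource an identifier that callers (e.g.
       * ptp_kvm) can test. Names are illustrative.
       */
      #include <stddef.h>
      #include <stdint.h>

      enum clocksource_ids {
              CSID_GENERIC,
              CSID_ARM_ARCH_COUNTER,
      };

      struct clocksource_sketch {
              const char           *name;
              enum clocksource_ids  id;
              uint64_t            (*read)(void);
      };

      static struct clocksource_sketch clocksource_counter = {
              .name = "arch_sys_counter",
              .id   = CSID_ARM_ARCH_COUNTER,  /* identifies this source to callers */
              .read = NULL,                   /* supplied by the timer driver */
      };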
      
      Cc: Mark Rutland <mark.rutland@arm.com>
      Reviewed-by: Andre Przywara <andre.przywara@arm.com>
      Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
      Signed-off-by: Marc Zyngier <maz@kernel.org>
      Link: https://lore.kernel.org/r/20201209060932.212364-6-jianyong.wu@arm.com
  6. 06 Dec 2020 (2 commits)
  7. 09 Jul 2020 (2 commits)
  8. 23 May 2020 (1 commit)
  9. 07 Mar 2020 (1 commit)
  10. 26 Feb 2020 (1 commit)
  11. 18 Feb 2020 (1 commit)
  12. 26 Jun 2019 (1 commit)
  13. 19 Jun 2019 (1 commit)
  14. 12 Jun 2019 (1 commit)
    • clocksource/drivers/arm_arch_timer: Don't trace count reader functions · 5d6168fc
      Committed by Julien Thierry
      With v5.2-rc1, the ftrace function_graph tracer locks up whenever it is
      enabled on arm64.
      
      Since commit 0ea41539 ("clocksource/arm_arch_timer: Use
      arch_timer_read_counter to access stable counters") a function pointer
      is consistently used to read the counter instead of potentially
      referencing an inlinable function.
      
      The graph tracer relies on accessing the timer counters to compute the
      time spent in functions, which causes the lockup when attempting to
      trace these code paths.
      
      Annotate the arm arch timer counter accessors as notrace.
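
      As a simplified sketch of the fix (the accessor body is illustrative; the
      point is the notrace annotation, which in the kernel keeps ftrace out of the
      function), assuming an aarch64 toolchain:

      /*
       * Sketch: a notrace counter accessor. The tracer timestamps every traced
       * function with this counter, so the accessor itself must not be traced
       * or it recurses. notrace is defined locally here as a stand-in for the
       * kernel's annotation.
       */
      #include <stdint.h>

      #define notrace __attribute__((no_instrument_function))

      static inline uint64_t notrace arch_counter_get_cntvct(void)
      {
              uint64_t cnt;

              asm volatile("mrs %0, cntvct_el0" : "=r" (cnt));
              return cnt;
      }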
      
      Fixes: 0ea41539 ("clocksource/arm_arch_timer: Use arch_timer_read_counter to access stable counters")
      Signed-off-by: Julien Thierry <julien.thierry@arm.com>
      Cc: Marc Zyngier <marc.zyngier@arm.com>
      Cc: Mark Rutland <mark.rutland@arm.com>
      Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Acked-by: Marc Zyngier <marc.zyngier@arm.com>
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
  15. 30 Apr 2019 (4 commits)
  16. 16 Apr 2019 (1 commit)
    • arm64: HWCAP: add support for AT_HWCAP2 · aaba098f
      Committed by Andrew Murray
      As we will exhaust the first 32 bits of AT_HWCAP, let's start
      exposing AT_HWCAP2 to userspace to give us up to 64 caps.
      
      Whilst it's possible to use the remaining 32 bits of AT_HWCAP, we
      prefer to expand into AT_HWCAP2 in order to provide a consistent
      view to userspace between ILP32 and LP64. However internal to the
      kernel we prefer to continue to use the full space of elf_hwcap.
      
      To reduce complexity and allow for future expansion, we now
      represent hwcaps in the kernel as ordinals and use a
      KERNEL_HWCAP_ prefix. This allows us to support automatic feature
      based module loading for all our hwcaps.
      
      We introduce cpu_set_feature to set hwcaps which complements the
      existing cpu_have_feature helper. These helpers allow us to clean
      up existing direct uses of elf_hwcap and reduce any future effort
      required to move beyond 64 caps.
      
      For convenience we also introduce cpu_{have,set}_named_feature which
      makes use of the cpu_feature macro to allow providing a hwcap name
      without a {KERNEL_}HWCAP_ prefix.
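
      A self-contained sketch of that scheme (names and helper bodies are
      illustrative, not the kernel's exact definitions): hwcaps are plain ordinals
      internally, and the AT_HWCAP/AT_HWCAP2 split only happens at the userspace
      boundary.

      /*
       * Sketch: hwcaps as ordinals, with the 32-bit AT_HWCAP/AT_HWCAP2 split
       * applied only at the userspace boundary. Identifiers are illustrative.
       */
      #include <stdint.h>

      enum {
              KERNEL_HWCAP_FP,
              KERNEL_HWCAP_ASIMD,
              /* ... each further cap is simply the next ordinal ... */
              KERNEL_HWCAP_MAX,
      };

      static uint64_t elf_hwcap;                      /* full internal view */

      static inline void cpu_set_feature(unsigned int num)
      {
              elf_hwcap |= 1ULL << num;
      }

      static inline int cpu_have_feature(unsigned int num)
      {
              return !!(elf_hwcap & (1ULL << num));
      }

      /* Named variants: cpu_have_named_feature(ASIMD) etc. */
      #define cpu_feature(name)             KERNEL_HWCAP_ ## name
      #define cpu_have_named_feature(name)  cpu_have_feature(cpu_feature(name))

      /* What the auxiliary vector would expose. */
      static inline uint32_t at_hwcap(void)  { return (uint32_t)elf_hwcap; }
      static inline uint32_t at_hwcap2(void) { return (uint32_t)(elf_hwcap >> 32); }

      With this layout, adding a 33rd cap is just another enum entry; only the
      aux-vector accessors care about the 32-bit split.
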
      Signed-off-by: Andrew Murray <andrew.murray@arm.com>
      [will: use const_ilog2() and tweak documentation]
      Signed-off-by: Will Deacon <will.deacon@arm.com>
  17. 12 Apr 2019 (1 commit)
  18. 23 Feb 2019 (1 commit)
    • clocksource/drivers/arch_timer: Workaround for Allwinner A64 timer instability · c950ca8c
      Committed by Samuel Holland
      The Allwinner A64 SoC is known[1] to have an unstable architectural
      timer, which manifests itself most obviously in the time jumping forward
      a multiple of 95 years[2][3]. This coincides with 2^56 cycles at a
      timer frequency of 24 MHz, implying that the time went slightly backward
      (and this was interpreted by the kernel as it jumping forward and
      wrapping around past the epoch).
      
      Investigation revealed instability in the low bits of CNTVCT at the
      point a high bit rolls over. This leads to power-of-two cycle forward
      and backward jumps. (Testing shows that forward jumps are about twice as
      likely as backward jumps.) Since the counter value returns to normal
      after an indeterminate read, each "jump" really consists of both a
      forward and backward jump from the software perspective.
      
      Unless the kernel is trapping CNTVCT reads, a userspace program is able
      to read the register in a loop faster than it changes. A test program
      running on all 4 CPU cores that reported jumps larger than 100 ms was
      run for 13.6 hours and reported the following:
      
       Count | Event
      -------+---------------------------
        9940 | jumped backward      699ms
         268 | jumped backward     1398ms
           1 | jumped backward     2097ms
       16020 | jumped forward       175ms
        6443 | jumped forward       699ms
        2976 | jumped forward      1398ms
           9 | jumped forward    356516ms
           9 | jumped forward    357215ms
           4 | jumped forward    714430ms
           1 | jumped forward   3578440ms
      
      This works out to a jump larger than 100 ms about every 5.5 seconds on
      each CPU core.
      
      The largest jump (almost an hour!) was the following sequence of reads:
          0x0000007fffffffff → 0x00000093feffffff → 0x0000008000000000
      
      Note that the middle bits don't necessarily read as all zeroes or
      all ones during the anomalous behavior; however, the low 10 bits checked
      by the function in this patch have never been observed with any other
      value.
      
      Also note that smaller jumps are much more common, with backward jumps
      of 2048 (2^11) cycles observed over 400 times per second on each core.
      (Of course, this is partially explained by lower bits rolling over more
      frequently.) Any one of these could have caused the 95 year time skip.
      
      Similar anomalies were observed while reading CNTPCT (after patching the
      kernel to allow reads from userspace). However, the CNTPCT jumps are
      much less frequent, and only small jumps were observed. The same program
      as before (except now reading CNTPCT) observed after 72 hours:
      
       Count | Event
      -------+---------------------------
          17 | jumped backward      699ms
          52 | jumped forward       175ms
        2831 | jumped forward       699ms
           5 | jumped forward      1398ms
      
      Further investigation showed that the instability in CNTPCT/CNTVCT also
      affected the respective timer's TVAL register. The following values were
      observed immediately after writing CNTV_TVAL to 0x10000000:
      
       CNTVCT             | CNTV_TVAL  | CNTV_CVAL          | CNTV_TVAL Error
      --------------------+------------+--------------------+-----------------
       0x000000d4a2d8bfff | 0x10003fff | 0x000000d4b2d8bfff | +0x00004000
       0x000000d4a2d94000 | 0x0fffffff | 0x000000d4b2d97fff | -0x00004000
       0x000000d4a2d97fff | 0x10003fff | 0x000000d4b2d97fff | +0x00004000
       0x000000d4a2d9c000 | 0x0fffffff | 0x000000d4b2d9ffff | -0x00004000
      
      The pattern of errors in CNTV_TVAL seemed to depend on exactly which
      value was written to it. For example, after writing 0x10101010:
      
       CNTVCT             | CNTV_TVAL  | CNTV_CVAL          | CNTV_TVAL Error
      --------------------+------------+--------------------+-----------------
       0x000001ac3effffff | 0x1110100f | 0x000001ac4f10100f | +0x1000000
       0x000001ac40000000 | 0x1010100f | 0x000001ac5110100f | -0x1000000
       0x000001ac58ffffff | 0x1110100f | 0x000001ac6910100f | +0x1000000
       0x000001ac66000000 | 0x1010100f | 0x000001ac7710100f | -0x1000000
       0x000001ac6affffff | 0x1110100f | 0x000001ac7b10100f | +0x1000000
       0x000001ac6e000000 | 0x1010100f | 0x000001ac7f10100f | -0x1000000
      
      I was also twice able to reproduce the issue covered by Allwinner's
      workaround[4], where writing to TVAL sometimes fails and both CVAL and
      TVAL are left with entirely bogus values. One instance produced the
      following values:
      
       CNTVCT             | CNTV_TVAL  | CNTV_CVAL
      --------------------+------------+--------------------------------------
       0x000000d4a2d6014c | 0x8fbd5721 | 0x000000d132935fff (615s in the past)
      Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>
      
      ========================================================================
      
      Because the CPU can read the CNTPCT/CNTVCT registers faster than they
      change, performing two reads of the register and comparing the high bits
      (like other workarounds) is not a workable solution. And because the
      timer can jump both forward and backward, no pair of reads can
      distinguish a good value from a bad one. The only way to guarantee a
      good value from consecutive reads would be to read _three_ times, and
      take the middle value only if the three values are 1) each unique and
      2) increasing. This takes at minimum 3 counter cycles (125 ns), or more
      if an anomaly is detected.
      
      However, since there is a distinct pattern to the bad values, we can
      optimize the common case (1022/1024 of the time) to a single read by
      simply ignoring values that match the error pattern. This still takes no
      more than 3 cycles in the worst case, and requires much less code. As an
      additional safety check, we still limit the loop iteration to the number
      of max-frequency (1.2 GHz) CPU cycles in three 24 MHz counter periods.
      
      For the TVAL registers, the simple solution is to not use them. Instead,
      read or write the CVAL and calculate the TVAL value in software.
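
      A hedged sketch of that strategy (the exact mask width and retry bound used
      by the real driver may differ): re-read the counter whenever the low bits
      match the all-zeros/all-ones error pattern, and derive the timer value from
      CVAL in software rather than using the TVAL register.

      /*
       * Sketch: reject counter reads whose low bits are all zeros or all ones
       * (the observed roll-over error pattern), bounded by a retry budget sized
       * for roughly three 24 MHz counter periods at a 1.2 GHz CPU clock.
       * Mask width, bound and helper names are illustrative.
       */
      #include <stdint.h>

      #define A64_BAD_LOW_MASK ((1ULL << 10) - 1)     /* low bits to check */
      #define A64_MAX_RETRIES  150

      extern uint64_t read_cntvct(void);              /* raw CNTVCT read, assumed */

      static uint64_t a64_read_counter(void)
      {
              uint64_t val;
              int retries = A64_MAX_RETRIES;

              do {
                      val = read_cntvct();
                      /* ((val + 1) & mask) <= 1 holds when the low bits are
                       * all ones or all zeros, i.e. the suspect pattern. */
              } while ((((val + 1) & A64_BAD_LOW_MASK) <= 1) && --retries);

              return val;
      }

      /* Avoid the unreliable TVAL register: derive the timer value from CVAL. */
      static int64_t a64_tval_from_cval(uint64_t cval)
      {
              return (int64_t)(cval - a64_read_counter());
      }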
      
      Although the manufacturer is aware of at least part of the erratum[4],
      there is no official name for it. For now, use the kernel-internal name
      "UNKNOWN1".
      
      [1]: https://github.com/armbian/build/commit/a08cd6fe7ae9
      [2]: https://forum.armbian.com/topic/3458-a64-datetime-clock-issue/
      [3]: https://irclog.whitequark.org/linux-sunxi/2018-01-26
      [4]: https://github.com/Allwinner-Homlet/H6-BSP4.9-linux/blob/master/drivers/clocksource/arm_arch_timer.c#L272
      Acked-by: Maxime Ripard <maxime.ripard@bootlin.com>
      Tested-by: Andre Przywara <andre.przywara@arm.com>
      Signed-off-by: Samuel Holland <samuel@sholland.org>
      Cc: stable@vger.kernel.org
      Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
  19. 20 Feb 2019 (1 commit)
  20. 01 Oct 2018 (1 commit)
  21. 02 Aug 2018 (1 commit)
  22. 11 Jul 2018 (1 commit)
  23. 06 Nov 2017 (2 commits)