- 30 Dec, 2021 40 commits
-
-
mainline inclusion
from mainline-v5.14-rc1
commit 19c3eaa7
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4G64B
CVE: NA

-------------------------------------------------

Dan Carpenter reported that:

 The patch bce29ac9: "trace: Add osnoise tracer" from Jun 22,
 2021, leads to the following static checker warning:

	kernel/trace/trace_osnoise.c:1103 run_osnoise()
	warn: unsigned 'noise' is never less than zero.

In this part of the code:

  1100                  /*
  1101                   * This shouldn't happen.
  1102                   */
  1103                  if (noise < 0) {
                            ^^^^^^^^^
  1104                          osnoise_taint("negative noise!");
  1105                          goto out;
  1106                  }
  1107

And the static checker is right because 'noise' is u64.

Make noise s64 and keep the check. It is important to check that the
time read is behaving correctly, so that we can trust the results.

I also re-arranged some variable declarations.

Link: https://lkml.kernel.org/r/acd7cd6e7d56b798a298c3bc8139a390b3c4ab52.1624986368.git.bristot@redhat.com

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: linux-kernel@vger.kernel.org
Fixes: bce29ac9 ("trace: Add osnoise tracer")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
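The reason the check only works with a signed type can be seen in a small userspace sketch. This is not the tracer's code: the helper names are hypothetical, and it assumes the usual two's complement conversion from uint64_t to int64_t.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned difference: if the clock went backwards the result wraps
 * around to a huge positive value, so it can never compare below zero. */
static uint64_t noise_unsigned(uint64_t start, uint64_t stop)
{
	return stop - start;
}

/* Signed difference: the same subtraction, reinterpreted as s64
 * (assumes two's complement, as on all mainstream platforms). */
static int64_t noise_signed(uint64_t start, uint64_t stop)
{
	return (int64_t)(stop - start);
}

/* The sanity check from the commit only makes sense on the signed value. */
static bool time_went_backwards(uint64_t start, uint64_t stop)
{
	return noise_signed(start, stop) < 0;
}
```

With start = 200 and stop = 100, the unsigned helper yields a value close to 2^64 while the signed one yields -100, which is what the "negative noise!" check is meant to catch.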
-
Committed by Colin Ian King
mainline inclusion
from mainline-v5.14-rc1
commit b62613b4
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4G64B
CVE: NA

-------------------------------------------------

There is a spelling mistake in a TP_printk message: the word
"interferences" is not the plural of "interference". Fix this.

Link: https://lkml.kernel.org/r/20210628125522.56361-1-colin.king@canonical.com

Reviewed-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
mainline inclusion
from mainline-v5.14-rc1
commit bd09c055
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4G64B
CVE: NA

-------------------------------------------------

s/RUNTIME IN USE/RUNTIME IN US/

Link: https://lkml.kernel.org/r/43e5160422a967218aa651c47f523e8d32d6a59e.1624872608.git.bristot@redhat.com

Fixes: bce29ac9 ("trace: Add osnoise tracer")
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
mainline inclusion
from mainline-v5.14-rc1
commit 498627b4
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4G64B
CVE: NA

-------------------------------------------------

kernel test robot reported:

  >> kernel/trace/trace_osnoise.c:1584:2: error: void function
  'osnoise_init_hotplug_support' should not return a value [-Wreturn-type]
           return 0;

This happens when !CONFIG_HOTPLUG_CPU.

Fix this problem by removing the return value.

Link: https://lkml.kernel.org/r/c7fc67f1a117cc88bab2e508c898634872795341.1624872608.git.bristot@redhat.com

Fixes: c8895e27 ("trace/osnoise: Support hotplug operations")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
mainline inclusion
from mainline-v5.14-rc1
commit 2a81afa3
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4G64B
CVE: NA

-------------------------------------------------

kernel test robot reported:

  >> kernel/trace/trace_osnoise.c:966:3: warning: comparison of distinct
  pointer types ('typeof ((interval)) *' (aka 'long long *') and
  'uint64_t *' (aka 'unsigned long long *'))
  [-Wcompare-distinct-pointer-types]
                   do_div(interval, USEC_PER_MSEC);
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/asm-generic/div64.h:228:28: note: expanded from macro 'do_div'
           (void)(((typeof((n)) *)0) == ((uint64_t *)0));  \
                  ~~~~~~~~~~~~~~~~~~ ^  ~~~~~~~~~~~~~~~

As interval cannot be negative because sample_period >= sample_runtime,
making interval u64 on osnoise_main() is enough to fix this problem.

Link: https://lkml.kernel.org/r/4ae1e7780563598563de079a3ef6d4d10b5f5546.1624872608.git.bristot@redhat.com

Fixes: bce29ac9 ("trace: Add osnoise tracer")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
mainline inclusion
from mainline-v5.14-rc1
commit c8895e27
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4G64B
CVE: NA

-------------------------------------------------

Enable and disable the osnoise/timerlat threads during CPU hotplug
online and offline operations, respectively.

Link: https://lore.kernel.org/linux-doc/20210621134636.5b332226@oasis.local.home/
Link: https://lkml.kernel.org/r/39f98590b3caeb3c32f09526214058efe0e9272a.1624372313.git.bristot@redhat.com

Cc: Phil Auld <pauld@redhat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Kate Carcia <kcarcia@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Suggested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
mainline inclusion
from mainline-v5.14-rc1
commit f7d9f637
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4G64B
CVE: NA

-------------------------------------------------

kernel test robot reported some osnoise functions with "no previous
prototype."

Fix these warnings by making local functions static, and by adding:

 void osnoise_trace_irq_entry(int id);
 void osnoise_trace_irq_exit(int id, const char *desc);

to include/linux/trace.h.

Link: https://lkml.kernel.org/r/e40d3cb4be8bde921f4b40fa6a095cf85ab807bd.1624872608.git.bristot@redhat.com

Fixes: bce29ac9 ("trace: Add osnoise tracer")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
mainline inclusion
from mainline-v5.16-rc1
commit 9bd98576
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4G64B
CVE: NA

-------------------------------------------------

s/CONFIG_OSNOISE_TRAECR/CONFIG_OSNOISE_TRACER/

No functional changes.

Link: https://lkml.kernel.org/r/33924a16f6e5559ce24952ca7d62561604bfd94a.1634308385.git.bristot@kernel.org

Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: x86@kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
mainline inclusion
from mainline-v5.14-rc7
commit d03721a6
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4G64B
CVE: NA

-------------------------------------------------

Some extra flags are printed to the trace header when using the
PREEMPT_RT config. The extra flags are: need-resched-lazy,
preempt-lazy-depth, and migrate-disable.

Without printing these fields, the osnoise specific fields are
shifted by three positions, for example:

 # tracer: osnoise
 #
 #                                _-----=> irqs-off
 #                               / _----=> need-resched
 #                              | / _---=> hardirq/softirq
 #                              || / _--=> preempt-depth                            MAX
 #                              || /                                             SINGLE     Interference counters:
 #                              ||||               RUNTIME      NOISE  %% OF CPU  NOISE    +-----------------------------+
 #           TASK-PID      CPU# ||||   TIMESTAMP    IN US       IN US  AVAILABLE  IN US     HW    NMI    IRQ   SIRQ THREAD
 #              | |         |   ||||      |           |             |    |            |      |      |      |      |      |
            <...>-741     [000] .......  1105.690909: 1000000        234  99.97660      36     21      0   1001     22      3
            <...>-742     [001] .......  1105.691923: 1000000        281  99.97190     197      7      0   1012     35     14
            <...>-743     [002] .......  1105.691958: 1000000       1324  99.86760     118     11      0   1016    155    143
            <...>-744     [003] .......  1105.691998: 1000000        109  99.98910      21      4      0   1004     33      7
            <...>-745     [004] .......  1105.692015: 1000000       2023  99.79770      97     37      0   1023     52     18

Add a new header for osnoise with the missing fields, to be used
when the PREEMPT_RT is enabled.

Link: https://lkml.kernel.org/r/1f03289d2a51fde5a58c2e7def063dc630820ad1.1626598844.git.bristot@kernel.org

Cc: Tom Zanussi <zanussi@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
mainline inclusion
from mainline-v5.14-rc1
commit a955d7ea
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4G64B
CVE: NA

-------------------------------------------------

The timerlat tracer aims to help preemptive kernel developers find
sources of wakeup latencies of real-time threads. Like cyclictest, the
tracer sets a periodic timer that wakes up a thread. The thread then
computes a *wakeup latency* value as the difference between the
*current time* and the *absolute time* that the timer was set to
expire. The main goal of timerlat is to trace in a way that helps
kernel developers.

Usage

Write the ASCII text "timerlat" into the current_tracer file of the
tracing system (generally mounted at /sys/kernel/tracing).

For example:

        [root@f32 ~]# cd /sys/kernel/tracing/
        [root@f32 tracing]# echo timerlat > current_tracer

It is possible to follow the trace by reading the trace file:

  [root@f32 tracing]# cat trace
  # tracer: timerlat
  #
  #                                _-----=> irqs-off
  #                               / _----=> need-resched
  #                              | / _---=> hardirq/softirq
  #                              || / _--=> preempt-depth
  #                              || /
  #                              ||||             ACTIVATION
  #           TASK-PID      CPU# ||||   TIMESTAMP    ID            CONTEXT                LATENCY
  #              | |         |   ||||      |         |                  |                       |
          <idle>-0       [000] d.h1    54.029328: #1     context    irq timer_latency       932 ns
           <...>-867     [000] ....    54.029339: #1     context thread timer_latency     11700 ns
          <idle>-0       [001] dNh1    54.029346: #1     context    irq timer_latency      2833 ns
           <...>-868     [001] ....    54.029353: #1     context thread timer_latency      9820 ns
          <idle>-0       [000] d.h1    54.030328: #2     context    irq timer_latency       769 ns
           <...>-867     [000] ....    54.030330: #2     context thread timer_latency      3070 ns
          <idle>-0       [001] d.h1    54.030344: #2     context    irq timer_latency       935 ns
           <...>-868     [001] ....    54.030347: #2     context thread timer_latency      4351 ns

The tracer creates a per-cpu kernel thread with real-time priority that
prints two lines at every activation. The first is the *timer latency*
observed at the *hardirq* context before the activation of the thread.
The second is the *timer latency* observed by the thread, which is the
same level that cyclictest reports. The ACTIVATION ID field
serves to relate the *irq* execution to its respective *thread* execution.

The irq/thread split is important to clarify in which context an
unexpectedly high value originates. The *irq* context can be
delayed by hardware related actions, such as SMIs, NMIs, IRQs
or by a thread masking interrupts. Once the timer happens, the delay
can also be influenced by blocking caused by threads. For example, by
postponing the scheduler execution via preempt_disable(), by the
scheduler execution, or by masking interrupts. Threads can
also be delayed by the interference from other threads and IRQs.

The timerlat can also take advantage of the osnoise: trace events.
For example:

        [root@f32 ~]# cd /sys/kernel/tracing/
        [root@f32 tracing]# echo timerlat > current_tracer
        [root@f32 tracing]# echo osnoise > set_event
        [root@f32 tracing]# echo 25 > osnoise/stop_tracing_total_us
        [root@f32 tracing]# tail -10 trace
             cc1-87882   [005] d..h...   548.771078: #402268 context    irq timer_latency      1585 ns
             cc1-87882   [005] dNLh1..   548.771082: irq_noise: local_timer:236 start 548.771077442 duration 4597 ns
             cc1-87882   [005] dNLh2..   548.771083: irq_noise: reschedule:253 start 548.771083017 duration 56 ns
             cc1-87882   [005] dNLh2..   548.771086: irq_noise: call_function_single:251 start 548.771083811 duration 2048 ns
             cc1-87882   [005] dNLh2..   548.771088: irq_noise: call_function_single:251 start 548.771086814 duration 1495 ns
             cc1-87882   [005] dNLh2..   548.771091: irq_noise: call_function_single:251 start 548.771089194 duration 1558 ns
             cc1-87882   [005] dNLh2..   548.771094: irq_noise: call_function_single:251 start 548.771091719 duration 1932 ns
             cc1-87882   [005] dNLh2..   548.771096: irq_noise: call_function_single:251 start 548.771094696 duration 1050 ns
             cc1-87882   [005] d...3..   548.771101: thread_noise:      cc1:87882 start 548.771078243 duration 10909 ns
      timerlat/5-1035    [005] .......   548.771103: #402268 context thread timer_latency     25960 ns

For further information see: Documentation/trace/timerlat-tracer.rst

Link: https://lkml.kernel.org/r/71f18efc013e1194bcaea1e54db957de2b19ba62.1624372313.git.bristot@redhat.com

Cc: Phil Auld <pauld@redhat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Kate Carcia <kcarcia@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
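The wakeup latency described in the timerlat commit is simply "current time minus the absolute expiry time of the timer". A minimal userspace sketch of that arithmetic, assuming both instants are available as struct timespec values (the helper names are hypothetical and this is not the tracer's code):

```c
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Flatten a timespec into a signed nanosecond count. */
static int64_t timespec_to_ns(struct timespec ts)
{
	return (int64_t)ts.tv_sec * NSEC_PER_SEC + ts.tv_nsec;
}

/* Latency = time observed at wakeup - time the timer was set to expire. */
static int64_t wakeup_latency_ns(struct timespec now, struct timespec expiry)
{
	return timespec_to_ns(now) - timespec_to_ns(expiry);
}
```

A cyclictest-like loop would program the next absolute expiry, sleep until it, read the clock on wakeup, and feed both values to wakeup_latency_ns(); timerlat reports the same quantity once in irq context and once in thread context.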
-
mainline inclusion
from mainline-v5.14-rc1
commit bce29ac9
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4G64B
CVE: NA

-------------------------------------------------

In the context of high-performance computing (HPC), the Operating System
Noise (*osnoise*) refers to the interference experienced by an application
due to activities inside the operating system. In the context of Linux,
NMIs, IRQs, SoftIRQs, and any other system thread can cause noise to the
system. Moreover, hardware-related jobs can also cause noise, for example,
via SMIs.

The osnoise tracer leverages the hwlat_detector by running a similar
loop with preemption, SoftIRQs and IRQs enabled, thus allowing all
the sources of *osnoise* during its execution. Using the same approach
as hwlat, osnoise takes note of the entry and exit point of any
source of interference, increasing a per-cpu interference counter. The
osnoise tracer also saves an interference counter for each source of
interference. The interference counter for NMI, IRQs, SoftIRQs, and
threads is increased anytime the tool observes these interferences' entry
events. When noise happens without any interference from the operating
system level, the hardware noise counter increases, pointing to
hardware-related noise. In this way, osnoise can account for any
source of interference. At the end of the period, the osnoise tracer
prints the sum of all noise, the max single noise, the percentage of CPU
available for the thread, and the counters for the noise sources.

Usage

Write the ASCII text "osnoise" into the current_tracer file of the
tracing system (generally mounted at /sys/kernel/tracing).

For example::

        [root@f32 ~]# cd /sys/kernel/tracing/
        [root@f32 tracing]# echo osnoise > current_tracer

It is possible to follow the trace by reading the trace file::

        [root@f32 tracing]# cat trace
        # tracer: osnoise
        #
        #                                _-----=> irqs-off
        #                               / _----=> need-resched
        #                              | / _---=> hardirq/softirq
        #                              || / _--=> preempt-depth                            MAX
        #                              || /                                             SINGLE     Interference counters:
        #                              ||||               RUNTIME      NOISE   % OF CPU  NOISE    +-----------------------------+
        #           TASK-PID      CPU# ||||   TIMESTAMP    IN US       IN US  AVAILABLE  IN US     HW    NMI    IRQ   SIRQ THREAD
        #              | |         |   ||||      |           |             |    |            |      |      |      |      |      |
                   <...>-859     [000] ....    81.637220: 1000000        190  99.98100       9     18      0   1007     18      1
                   <...>-860     [001] ....    81.638154: 1000000        656  99.93440      74     23      0   1006     16      3
                   <...>-861     [002] ....    81.638193: 1000000       5675  99.43250     202      6      0   1013     25     21
                   <...>-862     [003] ....    81.638242: 1000000        125  99.98750      45      1      0   1011     23      0
                   <...>-863     [004] ....    81.638260: 1000000       1721  99.82790     168      7      0   1002     49     41
                   <...>-864     [005] ....    81.638286: 1000000        263  99.97370      57      6      0   1006     26      2
                   <...>-865     [006] ....    81.638302: 1000000        109  99.98910      21      3      0   1006     18      1
                   <...>-866     [007] ....    81.638326: 1000000       7816  99.21840     107      8      0   1016     39     19

In addition to the regular trace fields (from TASK-PID to TIMESTAMP), the
tracer prints a message at the end of each period for each CPU that is
running an osnoise/CPU thread. The osnoise specific fields report:

 - The RUNTIME IN US reports the amount of time in microseconds that
   the osnoise thread kept looping reading the time.
 - The NOISE IN US reports the sum of noise in microseconds observed
   by the osnoise tracer during the associated runtime.
 - The % OF CPU AVAILABLE reports the percentage of CPU available for
   the osnoise thread during the runtime window.
 - The MAX SINGLE NOISE IN US reports the maximum single noise observed
   during the runtime window.
 - The Interference counters display how many times each of the
   respective interferences happened during the runtime window.

Note that the example above shows a high number of HW noise samples.
The reason is that this sample was taken on a virtual machine,
and the host interference is detected as a hardware interference.

Tracer options

The tracer has a set of options inside the osnoise directory. They are:

 - osnoise/cpus: CPUs on which an osnoise thread will execute.
 - osnoise/period_us: the period of the osnoise thread.
 - osnoise/runtime_us: how long an osnoise thread will look for noise.
 - osnoise/stop_tracing_us: stop the system tracing if a single noise
   higher than the configured value happens. Writing 0 disables this
   option.
 - osnoise/stop_tracing_total_us: stop the system tracing if total noise
   higher than the configured value happens. Writing 0 disables this
   option.
 - tracing_threshold: the minimum delta between two time() reads to be
   considered as noise, in us. When set to 0, the default value will
   be used, which is currently 5 us.

Additional Tracing

In addition to the tracer, a set of tracepoints were added to
facilitate the identification of the osnoise source.

 - osnoise:sample_threshold: printed anytime a noise is higher than
   the configurable tolerance_ns.
 - osnoise:nmi_noise: noise from NMI, including the duration.
 - osnoise:irq_noise: noise from an IRQ, including the duration.
 - osnoise:softirq_noise: noise from a SoftIRQ, including the
   duration.
 - osnoise:thread_noise: noise from a thread, including the duration.

Note that all the values are *net values*. For example, if another
thread preempts the osnoise thread while osnoise is running, a
thread_noise duration starts at that point. Then, an IRQ takes place,
preempting the thread_noise and starting an irq_noise. When the IRQ ends
its execution, it computes its duration, and this duration is
subtracted from the thread_noise, so as to avoid double accounting of
the IRQ execution. This logic is valid for all sources of noise.

Here is one example of the usage of these tracepoints::

       osnoise/8-961     [008] d.h.  5789.857532: irq_noise: local_timer:236 start 5789.857529929 duration 1845 ns
       osnoise/8-961     [008] dNh.  5789.858408: irq_noise: local_timer:236 start 5789.858404871 duration 2848 ns
     migration/8-54      [008] d...  5789.858413: thread_noise: migration/8:54 start 5789.858409300 duration 3068 ns
       osnoise/8-961     [008] ....  5789.858413: sample_threshold: start 5789.858404555 duration 8723 ns interferences 2

In this example, a noise sample of 8 microseconds was reported in the last
line, pointing to two interferences. Looking backward in the trace, the
two previous entries were about the migration thread running after a
timer IRQ execution. The first event is not part of the noise because
it took place one millisecond before.

It is worth noticing that the sum of the durations reported in the
tracepoints is smaller than the eight us reported in the sample_threshold.
The reason lies in the overhead of the entry and exit code that happens
before and after any interference execution. This justifies the dual
approach: measuring thread and tracing.

Link: https://lkml.kernel.org/r/e649467042d60e7b62714c9c6751a56299d15119.1624372313.git.bristot@redhat.com

Cc: Phil Auld <pauld@redhat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Kate Carcia <kcarcia@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
[
  Made the following functions static:
   trace_irqentry_callback()
   trace_irqexit_callback()
   trace_intel_irqentry_callback()
   trace_intel_irqexit_callback()

  Added to include/trace.h:
   osnoise_arch_register()
   osnoise_arch_unregister()

  Fixed define logic for LATENCY_FS_NOTIFY
  Reported-by: kernel test robot <lkp@intel.com>
]
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
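Two pieces of arithmetic from the osnoise description can be illustrated with a short sketch: the "% OF CPU AVAILABLE" column, and the *net value* rule that subtracts nested interference from the interrupted one. The helper names and the fixed-point scaling below are assumptions made for illustration; the tracer's own implementation may be organized differently.

```c
#include <stdint.h>

/* Percentage of CPU left to the osnoise thread, scaled by 10^5 so that
 * 9998100 stands for 99.98100 %. Inputs are in microseconds. */
static uint64_t cpu_available_e5(uint64_t runtime_us, uint64_t noise_us)
{
	return (runtime_us - noise_us) * 100 * 100000 / runtime_us;
}

/* Net duration of an interference: its gross duration minus the time
 * spent in interferences that preempted it (e.g. an IRQ hitting while a
 * thread_noise window is open), so nothing is counted twice. */
static uint64_t net_duration_ns(uint64_t start_ns, uint64_t end_ns,
				uint64_t nested_ns)
{
	return (end_ns - start_ns) - nested_ns;
}
```

For the first row of the sample trace (runtime 1000000 us, noise 190 us) the first helper returns 9998100, matching the 99.98100 shown in the AVAILABLE column.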
-
mainline inclusion
from mainline-v5.12-rc1
commit 36590c50
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4G64B
CVE: NA

-------------------------------------------------

The state of the interrupts (irqflags) and the preemption counter are
both passed down to tracing_generic_entry_update(). Only one bit of
irqflags is actually required: The on/off state. The complete 32bit
of the preemption counter isn't needed. Only whether any of the upper
bits (softirq, hardirq and NMI) are set, and the preemption depth, are
needed.

The irqflags and the preemption counter could be evaluated early and the
information stored in an integer `trace_ctx'.
tracing_generic_entry_update() would use the upper bits as the
TRACE_FLAG_* and the lower 8bit as the disabled-preemption depth
(considering that one must be subtracted from the counter in one
special case).

The actual preemption value is not used except for the tracing record.
The `irqflags' variable is mostly used only for the tracing record. An
exception here is for instance wakeup_tracer_call() or
probe_wakeup_sched_switch() which explicitly disable interrupts and use
that `irqflags' to save (and restore) the IRQ state and to record the
state.

Struct trace_event_buffer also has the `pc' and `flags' members which can
be replaced with `trace_ctx' since their actual value is not used
outside of trace recording.

This will reduce tracing_generic_entry_update() to simply assigning
values to struct trace_entry. The evaluation of the TRACE_FLAG_* bits is
moved to _tracing_gen_ctx_flags() which replaces preempt_count() and
local_save_flags() invocations.

As an example, ftrace_syscall_enter() may invoke:
- trace_buffer_lock_reserve() -> … -> tracing_generic_entry_update()
- event_trigger_unlock_commit()
  -> ftrace_trace_stack() -> … -> tracing_generic_entry_update()
  -> ftrace_trace_userstack() -> … -> tracing_generic_entry_update()

In this case the TRACE_FLAG_* bits were evaluated three times. By using
the `trace_ctx' they are evaluated once and assigned three times.

A build with all tracers enabled on x86-64 with and without the patch:

    text     data      bss      dec      hex    filename
21970669 17084168  7639260 46694097  2c87ed1 vmlinux.old
21970293 17084168  7639260 46693721  2c87d59 vmlinux.new

text shrank by 376 bytes, data remained constant.

Link: https://lkml.kernel.org/r/20210125194511.3924915-2-bigeasy@linutronix.de

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
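The idea of folding the flag bits and the preemption depth into one integer can be sketched as follows. The 16-bit shift, the flag values, and the helper names are assumptions chosen for illustration; the commit text only states that the upper bits carry the TRACE_FLAG_* bits and the lower 8 bits carry the depth.

```c
#include <stdint.h>

/* Hypothetical flag bits, stand-ins for the kernel's TRACE_FLAG_* values. */
#define SKETCH_FLAG_IRQS_OFF	0x01u
#define SKETCH_FLAG_HARDIRQ	0x08u
#define SKETCH_FLAG_SOFTIRQ	0x10u

/* Evaluate once: pack the flags above the 8-bit preemption depth. */
static uint32_t sketch_pack_ctx(uint32_t flags, uint32_t preempt_depth)
{
	return (flags << 16) | (preempt_depth & 0xffu);
}

/* Assign many times: each consumer just unpacks the stored value. */
static uint32_t sketch_ctx_flags(uint32_t ctx)
{
	return ctx >> 16;
}

static uint32_t sketch_ctx_depth(uint32_t ctx)
{
	return ctx & 0xffu;
}
```

In the ftrace_syscall_enter() example from the commit, the packing step would run once and the three tracing_generic_entry_update() calls would only perform the cheap unpacking.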
-
Committed by Steven Rostedt
mainline inclusion
from mainline-v5.14-rc1
commit 62de4f29
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4G64B
CVE: NA

-------------------------------------------------

To have nanosecond output displayed in a more human readable format, it's
nicer to convert it to a seconds format (XXX.YYYYYYYYY). The problem is that
to do so, the numbers must be divided by NSEC_PER_SEC, and the remainder
taken too. But as these numbers are 64 bit, this cannot be done simply with
the '/' and '%' operators, but must use do_div() instead.

Instead of performing the expensive do_div() in the hot path of the
tracepoint, it is more efficient to perform it during the output phase. But
passing in do_div() can confuse the parser, and do_div() doesn't work
exactly like a normal C function. It modifies the number in place, and we
don't want to modify the actual values in the ring buffer.

Two helper functions are now created:

  __print_ns_to_secs() and __print_ns_without_secs()

They both take a value of nanoseconds, and the former will return that
number divided by NSEC_PER_SEC, and the latter will mod it with NSEC_PER_SEC,
giving a way to print a nice human readable format:

 __print_fmt("time=%llu.%09u",
	__print_ns_to_secs(REC->nsec_val),
	__print_ns_without_secs(REC->nsec_val))

Link: https://lkml.kernel.org/r/e503b903045496c4ccde52843e1e318b422f7a56.1624372313.git.bristot@redhat.com

Cc: Phil Auld <pauld@redhat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Kate Carcia <kcarcia@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
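What the two helpers compute can be mirrored in plain userspace C, where 64-bit '/' and '%' are available directly. The function names below are hypothetical stand-ins, not the kernel macros, and the sketch leaves the input value untouched, which is the property the commit wants for values stored in the ring buffer.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define NSEC_PER_SEC 1000000000ULL

/* Whole seconds contained in a nanosecond value. */
static uint64_t sketch_ns_to_secs(uint64_t ns)
{
	return ns / NSEC_PER_SEC;
}

/* Nanoseconds left over once the whole seconds are removed. */
static uint32_t sketch_ns_without_secs(uint64_t ns)
{
	return (uint32_t)(ns % NSEC_PER_SEC);
}

/* Render as XXX.YYYYYYYYY; returns what snprintf() returns. */
static int sketch_format_ns(char *buf, size_t len, uint64_t ns)
{
	return snprintf(buf, len, "%" PRIu64 ".%09" PRIu32,
			sketch_ns_to_secs(ns), sketch_ns_without_secs(ns));
}
```

The "%09" padding matters: without it, a remainder of 5 ns would print as ".5" and read as half a second.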
-
mainline inclusion
from mainline-v5.14-rc1
commit bc87cf0a
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4G64B
CVE: NA

-------------------------------------------------

The hwlat detector and (in preparation for) the osnoise/timerlat tracers
have a set of u64 parameters that the user can read/write via tracefs.
For instance, we have hwlat_detector's window and width.

To reduce the code duplication, hwlat's window and width share the same
read function. However, they do not share the write functions because
they do different parameter checks. For instance, the width needs to
be smaller than the window, while the window needs to be larger
than the width. The same pattern repeats on osnoise/timerlat, and
a large portion of the code was devoted to the write function.

Despite having different checks, the write functions have the same
structure:

   read a user-space buffer
   take the lock that protects the value
   check for minimum and maximum acceptable values
      save the value
   release the lock
   return success or error

To reduce the code duplication also in the write functions, this patch
provides a generic read and write implementation for u64 values that
need to be within some minimum and/or maximum parameters, while
(potentially) being protected by a lock.

To use this interface, the structure trace_min_max_param needs to be
filled:

 struct trace_min_max_param {
 	struct mutex	*lock;
 	u64		*val;
 	u64		*min;
 	u64		*max;
 };

The desired value is stored in the variable pointed to by *val. If *min
points to a minimum acceptable value, it will be checked during the write
operation. Likewise, if *max points to a maximum allowable value, it
will be checked during the write operation. Finally, if *lock points
to a mutex, it will be taken at the beginning of the operation and
released at the end.

The definition of a trace_min_max_param needs to be passed as the
(private) *data for tracefs_create_file(), and the trace_min_max_fops
(added by this patch) as the *fops file_operations.

Link: https://lkml.kernel.org/r/3e35760a7c8b5c55f16ae5ad5fc54a0e71cbe647.1624372313.git.bristot@redhat.com

Cc: Phil Auld <pauld@redhat.com>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Kate Carcia <kcarcia@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
Cc: Clark Williams <williams@redhat.com>
Cc: John Kacur <jkacur@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Daniel Bristot de Oliveira <bristot@redhat.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Acked-by: Xie XiuQi <xiexiuqi@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
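The shared write structure described in this commit (parse, lock, range check, store, unlock) can be sketched in userspace. This is an analogy only: it uses a pthread mutex and strtoull() in place of the kernel's mutex and user-buffer parsing, and the struct and function names are hypothetical.

```c
#include <errno.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

/* Userspace stand-in for the structure described in the commit. */
struct sketch_min_max_param {
	pthread_mutex_t	*lock;	/* optional: may be NULL */
	uint64_t	*val;	/* where the accepted value is stored */
	uint64_t	*min;	/* optional lower bound, may be NULL */
	uint64_t	*max;	/* optional upper bound, may be NULL */
};

/* Parse buf as a decimal u64 and store it if it is within the bounds. */
static int sketch_min_max_write(struct sketch_min_max_param *p,
				const char *buf)
{
	char *end;
	unsigned long long v;
	int err = 0;

	errno = 0;
	v = strtoull(buf, &end, 10);
	if (errno || end == buf)
		return -EINVAL;

	if (p->lock)
		pthread_mutex_lock(p->lock);

	if (p->min && v < *p->min)
		err = -EINVAL;
	if (p->max && v > *p->max)
		err = -EINVAL;
	if (!err)
		*p->val = v;

	if (p->lock)
		pthread_mutex_unlock(p->lock);

	return err;
}
```

Because min and max are pointers, one parameter's bound can be another parameter's current value: pointing the width's max at the window variable expresses the "width must not exceed window" rule without a dedicated write function.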
-
Committed by Peter Ujfalusi
mainline inclusion
from mainline-v5.14-rc3
commit 4afa0c22
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4O662
CVE: NA

-------------------------------------------------

If driver_register() returns with an error, we need to free the memory
allocated for auxdrv->driver.name before returning from
__auxiliary_driver_register().

Fixes: 7de3697e ("Add auxiliary bus support")
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com>
Link: https://lore.kernel.org/r/20210713093438.3173-1-peter.ujfalusi@linux.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yuanzheng Song <songyuanzheng@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
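The leak pattern fixed here is a general one: a name is allocated before a registration call, and the failure path must release it. A minimal userspace analogy, assuming a hypothetical driver struct and a pluggable registration callback (none of these names come from the kernel):

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct sketch_driver {
	char *name;	/* heap-allocated "<module>.<driver>" string */
};

/* Stub registration callbacks used to exercise both paths. */
static int sketch_register_ok(struct sketch_driver *drv)
{
	(void)drv;
	return 0;
}

static int sketch_register_fails(struct sketch_driver *drv)
{
	(void)drv;
	return -1;
}

/* Allocate the name, try to register, and free the name on failure. */
static int sketch_driver_register(struct sketch_driver *drv,
				  const char *mod, const char *name,
				  int (*do_register)(struct sketch_driver *))
{
	size_t len = strlen(mod) + strlen(name) + 2;	/* '.' and NUL */
	int ret;

	drv->name = malloc(len);
	if (!drv->name)
		return -1;
	snprintf(drv->name, len, "%s.%s", mod, name);

	ret = do_register(drv);
	if (ret) {
		/* Without this, the name would leak on the error path. */
		free(drv->name);
		drv->name = NULL;
	}
	return ret;
}
```

On success the caller owns the name until the driver is unregistered; on failure nothing is left allocated.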
-
Committed by Dave Jiang
mainline inclusion
from mainline-v5.13-rc1
commit bbf44abe
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4O662
CVE: NA

-------------------------------------------------

Remove module bits in the auxiliary bus code since the auxiliary bus cannot be built as a module and the relevant code is not needed.

Cc: Dave Ertman <david.m.ertman@intel.com>
Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/161307488980.1896017.15627190714413338196.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yuanzheng Song <songyuanzheng@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Dave Jiang
mainline inclusion
from mainline-v5.12-rc1
commit 471b12c4
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4O662
CVE: NA

-------------------------------------------------

When the auxiliary device code is built into the kernel, it can be executed before the auxiliary bus is registered. This causes bus->p to not be allocated and triggers a NULL pointer dereference when the auxiliary bus device gets added with bus_add_device(). Call auxiliary_bus_init() under driver_init() so the bus is initialized before devices.

Below is the kernel splat for the bug:
[ 1.948215] BUG: kernel NULL pointer dereference, address: 0000000000000060
[ 1.950670] #PF: supervisor read access in kernel mode
[ 1.950670] #PF: error_code(0x0000) - not-present page
[ 1.950670] PGD 0
[ 1.950670] Oops: 0000 [#1] SMP NOPTI
[ 1.950670] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.10.0-intel-nextsvmtest+ #2205
[ 1.950670] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 1.950670] RIP: 0010:bus_add_device+0x64/0x140
[ 1.950670] Code: 00 49 8b 75 20 48 89 df e8 59 a1 ff ff 41 89 c4 85 c0 75 7b 48 8b 53 50 48 85 d2 75 03 48 8b 13 49 8b 85 a0 00 00 00 48 89 de <48> 8b 78 60 48 83 c7 18 e8 ef d9 a9 ff 41 89 c4 85 c0 75 45 48 8b
[ 1.950670] RSP: 0000:ff46032ac001baf8 EFLAGS: 00010246
[ 1.950670] RAX: 0000000000000000 RBX: ff4597f7414aa680 RCX: 0000000000000000
[ 1.950670] RDX: ff4597f74142bbc0 RSI: ff4597f7414aa680 RDI: ff4597f7414aa680
[ 1.950670] RBP: ff46032ac001bb10 R08: 0000000000000044 R09: 0000000000000228
[ 1.950670] R10: ff4597f741141b30 R11: ff4597f740182a90 R12: 0000000000000000
[ 1.950670] R13: ffffffffa5e936c0 R14: 0000000000000000 R15: 0000000000000000
[ 1.950670] FS: 0000000000000000(0000) GS:ff4597f7bba00000(0000) knlGS:0000000000000000
[ 1.950670] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1.950670] CR2: 0000000000000060 CR3: 000000002140c001 CR4: 0000000000f71ef0
[ 1.950670] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1.950670] DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400
[ 1.950670] PKRU: 55555554
[ 1.950670] Call Trace:
[ 1.950670] device_add+0x3ee/0x850
[ 1.950670] __auxiliary_device_add+0x47/0x60
[ 1.950670] idxd_pci_probe+0xf77/0x1180
[ 1.950670] local_pci_probe+0x4a/0x90
[ 1.950670] pci_device_probe+0xff/0x1b0
[ 1.950670] really_probe+0x1cf/0x440
[ 1.950670] ? rdinit_setup+0x31/0x31
[ 1.950670] driver_probe_device+0xe8/0x150
[ 1.950670] device_driver_attach+0x58/0x60
[ 1.950670] __driver_attach+0x8f/0x150
[ 1.950670] ? device_driver_attach+0x60/0x60
[ 1.950670] ? device_driver_attach+0x60/0x60
[ 1.950670] bus_for_each_dev+0x79/0xc0
[ 1.950670] ? kmem_cache_alloc_trace+0x323/0x430
[ 1.950670] driver_attach+0x1e/0x20
[ 1.950670] bus_add_driver+0x154/0x1f0
[ 1.950670] driver_register+0x70/0xc0
[ 1.950670] __pci_register_driver+0x54/0x60
[ 1.950670] idxd_init_module+0xe2/0xfc
[ 1.950670] ? idma64_platform_driver_init+0x19/0x19
[ 1.950670] do_one_initcall+0x4a/0x1e0
[ 1.950670] kernel_init_freeable+0x1fc/0x25c
[ 1.950670] ? rest_init+0xba/0xba
[ 1.950670] kernel_init+0xe/0x116
[ 1.950670] ret_from_fork+0x1f/0x30
[ 1.950670] Modules linked in:
[ 1.950670] CR2: 0000000000000060
[ 1.950670] ---[ end trace cd7d1b226d3ca901 ]---

Fixes: 7de3697e ("Add auxiliary bus support")
Reported-by: Jacob Pan <jacob.jun.pan@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Link: https://lore.kernel.org/r/20210210201611.1611074-1-dave.jiang@intel.com
Cc: stable <stable@vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yuanzheng Song <songyuanzheng@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Dave Jiang
mainline inclusion
from mainline-v5.11-rc1
commit 784b2c48
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4O662
CVE: NA

-------------------------------------------------

If the probe of the auxdrv failed, device->driver is set to NULL. During kernel shutdown, the bus shutdown will call auxdrv->shutdown and cause an invalid pointer dereference. Add a check to make sure device->driver is not NULL before we proceed.

Fixes: 7de3697e ("Add auxiliary bus support")
Cc: Dave Ertman <david.m.ertman@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/160710040926.1889434.8840329810698403478.stgit@djiang5-desk3.ch.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yuanzheng Song <songyuanzheng@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
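A minimal userspace sketch of the guard described above: the shutdown hook is only invoked when the device actually has a driver bound. The types `fake_dev` and `fake_drv` and the helper names are hypothetical and only mirror the shape of the problem.

```c
#include <assert.h>
#include <stddef.h>

struct fake_drv {
	void (*shutdown)(void *dev); /* optional shutdown hook */
};

struct fake_dev {
	struct fake_drv *driver; /* NULL when probe failed or nothing is bound */
};

int shutdown_calls; /* counts hook invocations for the usage check */

void count_shutdown(void *dev)
{
	(void)dev;
	shutdown_calls++;
}

/* Bus-level shutdown: skip devices without a bound driver. */
void fake_bus_shutdown(struct fake_dev *dev)
{
	struct fake_drv *drv = dev->driver;

	if (drv != NULL && drv->shutdown != NULL)
		drv->shutdown(dev);
}
```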
-
Committed by Greg Kroah-Hartman
mainline inclusion
from mainline-v5.11-rc1
commit 0d2bf11a
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4O662
CVE: NA

-------------------------------------------------

For some reason, the original aux bus patch had some really long lines in a few places, probably due to it being a very long-lived patch in development by many different people. Fix that up so that both files have the same line lengths and function formatting styles.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Ertman <david.m.ertman@intel.com>
Cc: Fred Oh <fred.oh@linux.intel.com>
Cc: Kiran Patil <kiran.patil@intel.com>
Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: Martin Habets <mhabets@solarflare.com>
Cc: Parav Pandit <parav@mellanox.com>
Cc: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Cc: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Cc: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/X8oiSFTpYHw1xE/o@kroah.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yuanzheng Song <songyuanzheng@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Greg Kroah-Hartman
mainline inclusion
from mainline-v5.11-rc1
commit 8142a46c
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4O662
CVE: NA

-------------------------------------------------

There's an effort to move the remove() callback in the driver core to not return an int, as nothing can be done if this function fails. To make that effort easier, make the aux bus remove function void to start with so that no users have to be changed sometime in the future.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Ertman <david.m.ertman@intel.com>
Cc: Fred Oh <fred.oh@linux.intel.com>
Cc: Kiran Patil <kiran.patil@intel.com>
Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: Martin Habets <mhabets@solarflare.com>
Cc: Parav Pandit <parav@mellanox.com>
Cc: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Cc: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Cc: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/X8ohB1ks1NK7kPop@kroah.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yuanzheng Song <songyuanzheng@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Greg Kroah-Hartman
mainline inclusion
from mainline-v5.11-rc1
commit 7bbb79ff
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4O662
CVE: NA

-------------------------------------------------

No need to include slab.h in include/linux/auxiliary_bus.h, as it is not needed there. Move it to drivers/base/auxiliary.c instead.

Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Dave Ertman <david.m.ertman@intel.com>
Cc: Fred Oh <fred.oh@linux.intel.com>
Cc: Kiran Patil <kiran.patil@intel.com>
Cc: Leon Romanovsky <leonro@nvidia.com>
Cc: Martin Habets <mhabets@solarflare.com>
Cc: Parav Pandit <parav@mellanox.com>
Cc: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Cc: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Cc: Shiraz Saleem <shiraz.saleem@intel.com>
Link: https://lore.kernel.org/r/X8og8xi3WkoYXet9@kroah.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yuanzheng Song <songyuanzheng@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Dave Ertman
mainline inclusion
from mainline-v5.11-rc1
commit 7de3697e
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4O662
CVE: NA

-------------------------------------------------

Add support for the Auxiliary Bus, auxiliary_device and auxiliary_driver. It enables drivers to create an auxiliary_device and bind an auxiliary_driver to it.

The bus supports probe/remove, shutdown, and suspend/resume callbacks. Each auxiliary_device has a unique string-based id; a driver binds to an auxiliary_device based on this id through the bus.

Co-developed-by: Kiran Patil <kiran.patil@intel.com>
Co-developed-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Co-developed-by: Fred Oh <fred.oh@linux.intel.com>
Co-developed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Kiran Patil <kiran.patil@intel.com>
Signed-off-by: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
Signed-off-by: Fred Oh <fred.oh@linux.intel.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Dave Ertman <david.m.ertman@intel.com>
Reviewed-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com>
Reviewed-by: Parav Pandit <parav@mellanox.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Martin Habets <mhabets@solarflare.com>
Link: https://lore.kernel.org/r/20201113161859.1775473-2-david.m.ertman@intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Link: https://lore.kernel.org/r/160695681289.505290.8978295443574440604.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Yuanzheng Song <songyuanzheng@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
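To make the string-id binding concrete, here is a small userspace sketch. It assumes, purely for illustration, that a device is named "module.device.instance" and that a driver lists "module.device" strings in a NULL-terminated table; the helper names `aux_name_matches()` and `aux_match_id()` are hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True when id_name equals dev_name with its trailing ".instance" removed. */
bool aux_name_matches(const char *dev_name, const char *id_name)
{
	const char *dot = strrchr(dev_name, '.');
	size_t prefix_len;

	if (dot == NULL)
		return false; /* no instance suffix, cannot match */
	prefix_len = (size_t)(dot - dev_name);
	return strlen(id_name) == prefix_len &&
	       strncmp(dev_name, id_name, prefix_len) == 0;
}

/* Walk a NULL-terminated id table and return the first matching entry. */
const char *aux_match_id(const char *const *id_table, const char *dev_name)
{
	for (; *id_table != NULL; id_table++)
		if (aux_name_matches(dev_name, *id_table))
			return *id_table;
	return NULL;
}
```

Comparing lengths before the prefix keeps "foo_mod.eth" from matching a device called "foo_mod.eth2.0".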
-
Committed by Lijun Fang
ascend inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMM0
CVE: NA

-------------------

Set CONFIG_HISI_SVM to m by default.

Signed-off-by: Lijun Fang <fanglijun3@huawei.com>
Reviewed-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Lijun Fang
ascend inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4JMM0
CVE: NA

-------------------

Build svm as a module by default. Remove the get mem info functions; users can get the meminfo from procfs.

Signed-off-by: Lijun Fang <fanglijun3@huawei.com>
Reviewed-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Zhang Jian
ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4K2U5
CVE: NA

-------------------------------------------------

Enable the ascend oom control features in the openeuler_defconfig default config.

Signed-off-by: Zhang Jian <zhangjian210@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Weilong Chen
ascend inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4K2U5
CVE: NA

-------------------------------------------------

Support disabling the oom-killer and reporting oom events to bbox.

vm.enable_oom_killer:
    0: disable oom killer
    1: enable oom killer (default, compatible with mainline)

Signed-off-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zhang Jian <zhangjian210@huawei.com>
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Kefeng Wang
maillist inclusion
category: Feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NDAW
CVE: NA

Reference: https://lore.kernel.org/lkml/20211226083912.166512-4-wangkefeng.wang@huawei.com/t/

-------------------

This patch selects HAVE_ARCH_HUGE_VMALLOC to let X86_64 and X86_PAE support huge vmalloc mappings. It is disabled by default; use hugevmalloc=on to enable it.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
Reviewed-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Kefeng Wang
maillist inclusion
category: Feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NDAW
CVE: NA

Reference: https://lore.kernel.org/lkml/20211226083912.166512-4-wangkefeng.wang@huawei.com/t/

-------------------

This patch selects HAVE_ARCH_HUGE_VMALLOC to let arm64 support huge vmalloc mappings. It is disabled by default; use hugevmalloc=on to enable it in some scenarios.

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
Reviewed-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Kefeng Wang
maillist inclusion
category: Feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NDAW
CVE: NA

Reference: https://lore.kernel.org/lkml/20211226083912.166512-4-wangkefeng.wang@huawei.com/t/

-------------------

Add HUGE_VMALLOC_DEFAULT_ENABLED to let the user choose whether or not to enable huge vmalloc mappings by default. This allows more architectures to support the huge vmalloc mappings feature without enabling it by default.

Add a hugevmalloc=on/off parameter to enable or disable this feature at boot time. nohugevmalloc is still supported and is equivalent to hugevmalloc=off.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
Reviewed-by: Weilong Chen <chenweilong@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
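The parameter semantics described above could be sketched in userspace as below. `parse_hugevmalloc()` is a hypothetical helper that takes a whole command-line token; the real kernel code would go through its own boot-parameter machinery, which is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Interpret one command-line token. Returns 0 and updates *enabled when
 * the token is recognised, -1 otherwise (leaving *enabled untouched).
 */
int parse_hugevmalloc(const char *arg, bool *enabled)
{
	if (strcmp(arg, "hugevmalloc=on") == 0) {
		*enabled = true;
		return 0;
	}
	/* nohugevmalloc is kept as an alias for hugevmalloc=off. */
	if (strcmp(arg, "hugevmalloc=off") == 0 ||
	    strcmp(arg, "nohugevmalloc") == 0) {
		*enabled = false;
		return 0;
	}
	return -1;
}
```

The initial value of `enabled` would play the role of the compile-time default chosen through HUGE_VMALLOC_DEFAULT_ENABLED.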
-
Committed by Li Zefan
euler inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4OPKC
CVE: NA

-------------------------------------------------

Signed-off-by: Pavel Emelyanov <xemul@parallels.com>
Signed-off-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: luojiajun <luojiajun3@huawei.com>
Reviewed-by: Li Zefan <lizefan@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: Cui GaoSheng <cuigaosheng1@huawei.com>
Reviewed-by: weiyang wang <wangweiyang2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: feature
bugzilla: https://gitee.com/openeuler/kernel/issues/I4LL14
CVE: NA

-------------------------------------------------

Unlike Intel-RDT, MPAM needs to handle more cases when monitoring. Two labels, PARTID and PMG, are embedded into one single data stream. They may work at the same time, or only PMG may work; if only PMG works, the number of PMGs determines the number of resources that can be monitored at the same time.

For instance (NR_PARTID equals 2, NR_PMG equals 2):

(1) PARTID and PMG work together

    RMID = PARTID + PMG*NR_PARTID
    0      0        0
    1      1        0
    2      0        1
    3      1        1

(2) only PMG works

    RMID = PARTID + PMG*NR_PARTID
    0      0        0
    0      1        0      (PARTID=1 makes no sense)
    1      0        1
    1      1        1      (PARTID=1 makes no sense)

For those reasons, we should take care with the usage of the rmid remap matrix. Two fields are added to struct rmid_transform for measuring allocation and release of monitor resources (RMIDs):

    @step_size: Step size from traversing the point of matrix once
    @step_cnt:  Indicates how many times to traverse (e.g. if cdp, step_cnt=2)

step_size is set to 1 by default. If only PMG (NR_PMG=4) works, it is made equal to the number of columns. step_cnt means how many are allocated and released each time. At this time the rmid remap matrix looks like:

    ^
    |
    ------column------>

    RMID 0 1 2 3   (step_size=1)
         `---'
            `--> (step_cnt=2 if cdp enabled)

    RMID 0 1 2 3   (step_size=1)
         `--
            `--> (step_cnt=1 if cdp disabled)

If PARTID (NR_PARTID=4) and PMG (NR_PMG=4) work together, the rmid remap matrix looks like:

    ------------row------------>
    |
    |  RMID 0  1  2  3   (step_size=1)
    |       `---'
    |          `--> (step_cnt=2 if cdp enabled)
    |       4  5  6  7
    |       8  9  10 11
    v       12 13 14 15

In addition, step_size values other than 1 (cross-line traversal) are also supported, but this scenario has not occurred so far.

Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
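The mapping RMID = PARTID + PMG*NR_PARTID from case (1) can be checked with a few lines of C. The helper names `rmid_encode()` and `rmid_decode()` and the `mpam_labels` struct are hypothetical, introduced only to work through the NR_PARTID=2, NR_PMG=2 example; they are not functions from the patch.

```c
#include <assert.h>

struct mpam_labels {
	unsigned int partid;
	unsigned int pmg;
};

/* RMID = PARTID + PMG * NR_PARTID, as in case (1) of the commit message. */
unsigned int rmid_encode(unsigned int partid, unsigned int pmg,
			 unsigned int nr_partid)
{
	return partid + pmg * nr_partid;
}

/* Inverse mapping: recover the (PARTID, PMG) pair from an RMID. */
struct mpam_labels rmid_decode(unsigned int rmid, unsigned int nr_partid)
{
	struct mpam_labels l = { rmid % nr_partid, rmid / nr_partid };

	return l;
}
```

With nr_partid = 2 the four (PARTID, PMG) pairs map to RMIDs 0 through 3 in the order listed in the example.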
-
Committed by Wang ShaoBo
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4LMMF
CVE: NA

-------------------------------------------------

This adds a hint message when rmid modification fails.

Fixes: a85aba6a ("mpam: Add support for group rmid modify")
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wang ShaoBo
hulk inclusion
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I3YAI3
CVE: NA

-------------------------------------------------

The following error occurred occasionally on a machine that supports MPAM:

[ 13.321386][ T658] Unable to handle kernel paging request at virtual address ffff80001115816c
[ 13.326013][ T684] hid-generic 0003:12D1:0003.0002: input,hidraw1: USB HID v1.10 Mouse [Keyboard/Mouse KVM 1.1.0] on usb-0000:7a:01.0-1.1/input1
[ 13.340558][ T658] Mem abort info:
[ 13.340563][ T658] ESR = 0x86000007
[ 13.352567][ T5] hub 6-1:1.0: USB hub found
[ 13.364750][ T658] EC = 0x21: IABT (current EL), IL = 32 bits
[ 13.369891][ T5] hub 6-1:1.0: 4 ports detected
[ 13.373871][ T658] SET = 0, FnV = 0
[ 13.396107][ T658] EA = 0, S1PTW = 0
[ 13.400599][ T658] swapper pgtable: 64k pages, 48-bit VAs, pgdp=0000000029540000
[ 13.408726][ T658] [ffff80001115816c] pgd=0000205fffff0003, p4d=0000205fffff0003, pud=0000205fffff0003, pmd=0000205ffffe0003, pte=0000000000000000
[ 13.423346][ T658] Internal error: Oops: 86000007 [#1] SMP
[ 13.429720][ T658] Modules linked in:
[ 13.434243][ T658] CPU: 72 PID: 658 Comm: kworker/72:1 Not tainted 5.10.0-4.17.0.28.oe1.aarch64 #1
[ 13.443966][ T658] Hardware name: Huawei TaiShan 200 (Model 2280)/BC82AMDDA, BIOS 1.70 01/07/2021
[ 13.453683][ T658] Workqueue: events mpam_enable
[ 13.459206][ T658] pstate: 20c00009 (nzCv daif +PAN +UAO -TCO BTYPE=--)
[ 13.466625][ T658] pc : mpam_enable+0x194/0x1d8
[ 13.472019][ T658] lr : mpam_enable+0x194/0x1d8
[ 13.477301][ T658] sp : ffff80004664fd70
[ 13.481937][ T658] x29: ffff80004664fd70 x28: 0000000000000000
[ 13.488578][ T658] x27: ffff00400484a648 x26: ffff800011b71080
[ 13.495306][ T658] x25: 0000000000000000 x24: ffff800011b6cda0
[ 13.502001][ T658] x23: ffff800011646f18 x22: ffff800011b6cd80
[ 13.508684][ T658] x21: ffff800011b6c000 x20: ffff800011646f08
[ 13.515425][ T658] x19: ffff800011646f70 x18: 0000000000000020
[ 13.522075][ T658] x17: 000000001790b332 x16: 0000000000000001
[ 13.528785][ T658] x15: ffffffffffffffff x14: ff00000000000000
[ 13.535464][ T658] x13: ffffffffffffffff x12: 0000000000000006
[ 13.542045][ T658] x11: 00000091cea718e2 x10: 0000000000000b90
[ 13.548735][ T658] x9 : ffff80001009ebac x8 : ffff2040061aabf0
[ 13.555383][ T658] x7 : ffffa05f8dca0000 x6 : 000000000000000f
[ 13.561924][ T658] x5 : 0000000000000000 x4 : ffff2040061aa000
[ 13.568613][ T658] x3 : ffff80001164dfa0 x2 : 00000000ffffffff
[ 13.575267][ T658] x1 : ffffa05f8dca0000 x0 : 00000000000000c1
[ 13.581813][ T658] Call trace:
[ 13.585600][ T658] mpam_enable+0x194/0x1d8
[ 13.590450][ T658] process_one_work+0x1cc/0x390
[ 13.595654][ T658] worker_thread+0x70/0x2f0
[ 13.600499][ T658] kthread+0x118/0x120
[ 13.604935][ T658] ret_from_fork+0x10/0x18
[ 13.609717][ T658] Code: bad PC value
[ 13.613944][ T658] ---[ end trace f1e305d2c339f67f ]---
[ 13.753818][ T658] Kernel panic - not syncing: Oops: Fatal exception
[ 13.760885][ T658] SMP: stopping secondary CPUs
[ 13.765933][ T658] Kernel Offset: disabled
[ 13.770516][ T658] CPU features: 0x8040002,22208a38
[ 13.775862][ T658] Memory Limit: none
[ 13.913929][ T658] ---[ end Kernel panic - not syncing:

The process of MPAM devices initialization is like this:

    mpam_discovery_start()
    ...                                 // discover devices
    mpam_discovery_complete()           // hang up the mpam_online/offline_cpu callbacks
    -=> mpam_cpu_online()               // probe all devices
    -=> mpam_enable()                   // prepare for resctrl
    (1) -=> cpuhp_remove_state()        // clean resctrl internal structure
    (2) -=> cpuhp_setup_state()         // rehang mpam_online/offline_cpu callbacks
    -=> mpam_cpu_online()               // it does not call mpam_enable again
    -=> mpam_resctrl_cpu_online()       // pull up resctrl

The re-hang process of the mpam_cpu_online/offline callbacks should not be disturbed by irqs, to ensure that the CPU context is reliable before re-entering mpam_cpu_online(), which always happens between (1) and (2).

Fixes: 2ab89c89 ("arm64/mpam: resctrl: Re-synchronise resctrl's view of online CPUs")
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Reviewed-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Xingang Wang
stable inclusion
category: feature
from stable-5.13-rc1
commit 18d73124
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NR4D

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=18d731242d5c67c0783126c42d3f85870cec2df5

-------------------------------------------------

This can fail, and seems to be a popular target for syzkaller error injection. Check the error return and unwind with put_device().

Fixes: 7b96953b ("vfio: Mediated device Core driver")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Message-Id: <9-v2-d36939638fc6+d54-vfio2_jgg@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Xingang Wang <wangxingang5@huawei.com>
Reviewed-by: Xu Xiaoyang <xuxiaoyang2@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Marc Zyngier
mainline inclusion
from mainline-v5.14-rc1
commit d0c94c49
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d0c94c49792cf780cbfefe29f81bb8c3b73bc76b

-------------------

Restoring a guest with an active virtual PMU results in no perf counters being instantiated on the host side. Not quite what you'd expect from a restore.

In order to fix this, force a writeback of PMCR_EL0 on the first run of a vcpu (using a new request so that it happens once the vcpu has been loaded). This will in turn create all the host-side counters that were missing.

Reported-by: Jinank Jain <jinankj@amazon.de>
Tested-by: Jinank Jain <jinankj@amazon.de>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/87wnrbylxv.wl-maz@kernel.org
Link: https://lore.kernel.org/r/b53dfcf9bbc4db7f96154b1cd5188d72b9766358.camel@amazon.de
Signed-off-by: Jingyi Wang <wangjingyi11@huawei.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Alexandru Elisei
mainline inclusion
from mainline-v5.11-rc1
commit 9bbfa4b5
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9bbfa4b565379eeb2fb8fdbcc9979549ae0e48d9

-------------------

When enabling the PMU in kvm_arm_pmu_v3_enable(), KVM returns early if the PMU flag created is false and skips any other checks. Because PMU emulation is gated only on the VCPU feature being set, this makes it possible for userspace to get away with setting the VCPU feature but not doing any initialization for the PMU. Fix it by returning an error when trying to run the VCPU if the PMU hasn't been initialized correctly.

The PMU is marked as created only if the interrupt ID has been set when using an in-kernel irqchip. This means the same check in kvm_arm_pmu_v3_enable() is redundant; remove it.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201126144916.164075-1-alexandru.elisei@arm.com
Signed-off-by: Jingyi Wang <wangjingyi11@huawei.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Marc Zyngier
mainline inclusion
from mainline-v5.11-rc1
commit 14bda7a9
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=14bda7a927336055d7c0deb1483f9cdb687c2080

-------------------

There are a number of places where we check for the KVM_ARM_VCPU_PMU_V3 feature. Wrap this check into a new kvm_vcpu_has_pmu(), and use it at the existing locations.

No functional change.

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Jingyi Wang <wangjingyi11@huawei.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Wanpeng Li
mainline inclusion
from mainline-v5.14-rc1
commit 2735886c
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2735886c9ef115fc7b40d27bfe73605c38e9d56b

-------------------

KVM_GET_LAPIC stores the current value of TMCCT and KVM_SET_LAPIC's memcpy stores it in vcpu->arch.apic->regs. KVM_SET_LAPIC could store zero in vcpu->arch.apic->regs after it uses it, and then the stored value would always be zero. In addition, the TMCCT is always computed on-demand and never directly readable.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1623223000-18116-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jingyi Wang <wangjingyi11@huawei.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
-
Committed by Sean Christopherson
mainline inclusion
from mainline-v5.14-rc1
commit 0aa18375
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=0aa1837533e5f4be8cc21bbc06314c23ba2c5447

-------------------

Reset the MMU context at vCPU INIT (and RESET for good measure) if CR0.PG was set prior to INIT. Simply re-initializing the current MMU is not sufficient as the current root HPA may not be usable in the new context. E.g. if TDP is disabled and INIT arrives while the vCPU is in long mode, KVM will fail to switch to the 32-bit pae_root and bomb on the next VM-Enter due to running with a 64-bit CR3 in 32-bit mode.

This bug was papered over in both VMX and SVM, but still managed to rear its head in the MMU role on VMX. Because EFER.LMA=1 requires CR0.PG=1, kvm_calc_shadow_mmu_root_page_role() checks for EFER.LMA without first checking CR0.PG. VMX's RESET/INIT flow writes CR0 before EFER, and so an INIT with the vCPU in 64-bit mode will cause the hack-a-fix to generate the wrong MMU role.

In VMX, the INIT issue is specific to running without unrestricted guest since unrestricted guest is available if and only if EPT is enabled. Commit 8668a3c4 ("KVM: VMX: Reset mmu context when entering real mode") resolved the issue by forcing a reset when entering emulated real mode.

In SVM, commit ebae871a ("kvm: svm: reset mmu on VCPU reset") forced a MMU reset on every INIT to work around the flaw in common x86. Note, at the time the bug was fixed, the SVM problem was exacerbated by a complete lack of a CR4 update.

The vendor resets will be reverted in future patches, primarily to aid bisection in case there are non-INIT flows that rely on the existing VMX logic.

Because CR0.PG is unconditionally cleared on INIT, and because CR0.WP and all CR4/EFER paging bits are ignored if CR0.PG=0, simply checking that CR0.PG was '1' prior to INIT/RESET is sufficient to detect a required MMU context reset.

Cc: stable@vger.kernel.org
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20210622175739.3610207-4-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jingyi Wang <wangjingyi11@huawei.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
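The detection rule in the last paragraph of the message reduces to a single bit test. The sketch below is a userspace illustration only: `mmu_reset_needed_on_init()` is a hypothetical helper name, and `CR0_PG` is defined locally as bit 31 of CR0, the architectural paging-enable bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CR0_PG (UINT64_C(1) << 31) /* CR0.PG, the paging-enable bit */

/*
 * INIT clears CR0.PG unconditionally, so paging state can only change
 * across INIT/RESET if PG was set beforehand. That alone decides whether
 * the MMU context has to be rebuilt.
 */
bool mmu_reset_needed_on_init(uint64_t old_cr0)
{
	return (old_cr0 & CR0_PG) != 0;
}
```

Checking the old CR0 value rather than EFER or CR4 avoids depending on the order in which the vendor code rewrites those registers during INIT.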
-
Committed by Wanpeng Li
mainline inclusion
from mainline-v5.13-rc6
commit e898da78
category: bugfix
bugzilla: https://gitee.com/openeuler/kernel/issues/I4NP0K
CVE: NA

Reference: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e898da784aed0ea65f7672d941c01dc9b79e6299

-------------------

According to the SDM 10.5.4.1:

    A write of 0 to the initial-count register effectively stops the local APIC timer, in both one-shot and periodic mode.

However, the lapic timer oneshot/periodic mode which is emulated by vmx-preemption timer doesn't stop by writing 0 to TMICT since vmx->hv_deadline_tsc is still programmed and the guest will receive the spurious timer interrupt later. This patch fixes it by also cancelling the vmx-preemption timer when writing 0 to the initial-count register.

Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1623050385-100988-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jingyi Wang <wangjingyi11@huawei.com>
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
Reviewed-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Zheng Zengkai <zhengzengkai@huawei.com>
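The intended behaviour can be modelled with a toy timer that has two backends, where a write of 0 must leave neither armed. Everything here (`fake_lapic_timer`, `fake_write_tmict()`, and the choice to always re-arm the hardware-assisted backend) is a simplifying assumption for illustration, not the KVM implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct fake_lapic_timer {
	uint32_t initial_count;
	bool sw_timer_armed; /* stand-in for a software timer backend */
	bool hv_timer_armed; /* stand-in for the VMX preemption timer */
};

/* Model of a guest write to the initial-count register (TMICT). */
void fake_write_tmict(struct fake_lapic_timer *t, uint32_t val)
{
	/* Cancel both backends first so that no stale deadline survives. */
	t->sw_timer_armed = false;
	t->hv_timer_armed = false;

	t->initial_count = val;
	if (val != 0)
		t->hv_timer_armed = true; /* re-arm only for a non-zero count */
}
```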
-