- 02 8月, 2017 2 次提交
-
-
由 Vikas Shivappa 提交于
We currently have a CONFIG_RDT_A which is for RDT(Resource directory technology) allocation based resctrl filesystem interface. As a preparation to add support for RDT monitoring as well into the same resctrl filesystem, change the config option to be CONFIG_RDT which would include both RDT allocation and monitoring code. No functional change. Signed-off-by: NVikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: davidcc@google.com Cc: reinette.chatre@intel.com Link: http://lkml.kernel.org/r/1501017287-28083-4-git-send-email-vikas.shivappa@linux.intel.com
-
由 Vikas Shivappa 提交于
'perf cqm' never worked due to the incompatibility between perf infrastructure and cqm hardware support. The hardware uses RMIDs to track the llc occupancy of tasks and these RMIDs are per package. This makes monitoring a hierarchy like cgroup along with monitoring of tasks separately difficult and several patches sent to lkml to fix them were NACKed. Further more, the following issues in the current perf cqm make it almost unusable: 1. No support to monitor the same group of tasks for which we do allocation using resctrl. 2. It gives random and inaccurate data (mostly 0s) once we run out of RMIDs due to issues in Recycling. 3. Recycling results in inaccuracy of data because we cannot guarantee that the RMID was stolen from a task when it was not pulling data into cache or even when it pulled the least data. Also for monitoring llc_occupancy, if we stop using an RMID_x and then start using an RMID_y after we reclaim an RMID from an other event, we miss accounting all the occupancy that was tagged to RMID_x at a later perf_count. 2. Recycling code makes the monitoring code complex including scheduling because the event can lose RMID any time. Since MBM counters count bandwidth for a period of time by taking snap shot of total bytes at two different times, recycling complicates the way we count MBM in a hierarchy. Also we need a spin lock while we do the processing to account for MBM counter overflow. We also currently use a spin lock in scheduling to prevent the RMID from being taken away. 4. Lack of support when we run different kind of event like task, system-wide and cgroup events together. Data mostly prints 0s. This is also because we can have only one RMID tied to a cpu as defined by the cqm hardware but a perf can at the same time tie multiple events during one sched_in. 5. No support of monitoring a group of tasks. There is partial support for cgroup but it does not work once there is a hierarchy of cgroups or if we want to monitor a task in a cgroup and the cgroup itself. 6. No support for monitoring tasks for the lifetime without perf overhead. 7. It reported the aggregate cache occupancy or memory bandwidth over all sockets. But most cloud and VMM based use cases want to know the individual per-socket usage. Signed-off-by: NVikas Shivappa <vikas.shivappa@linux.intel.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: ravi.v.shankar@intel.com Cc: tony.luck@intel.com Cc: fenghua.yu@intel.com Cc: peterz@infradead.org Cc: eranian@google.com Cc: vikas.shivappa@intel.com Cc: ak@linux.intel.com Cc: davidcc@google.com Cc: reinette.chatre@intel.com Link: http://lkml.kernel.org/r/1501017287-28083-2-git-send-email-vikas.shivappa@linux.intel.com
-
- 28 7月, 2017 1 次提交
-
-
由 Matthias Kaehlcke 提交于
The clang warning 'address-of-packed-member' is disabled for the general kernel code, also disable it for the x86 boot code. This suppresses a bunch of warnings like this when building with clang: ./arch/x86/include/asm/processor.h:535:30: warning: taking address of packed member 'sp0' of class or structure 'x86_hw_tss' may result in an unaligned pointer value [-Waddress-of-packed-member] return this_cpu_read_stable(cpu_tss.x86_tss.sp0); ^~~~~~~~~~~~~~~~~~~ ./arch/x86/include/asm/percpu.h:391:59: note: expanded from macro 'this_cpu_read_stable' #define this_cpu_read_stable(var) percpu_stable_op("mov", var) ^~~ ./arch/x86/include/asm/percpu.h:228:16: note: expanded from macro 'percpu_stable_op' : "p" (&(var))); ^~~ Signed-off-by: NMatthias Kaehlcke <mka@chromium.org> Cc: Doug Anderson <dianders@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Masahiro Yamada <yamada.masahiro@socionext.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170725215053.135586-1-mka@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 27 7月, 2017 5 次提交
-
-
由 Wanpeng Li 提交于
Preempt can occur in the preemption timer expiration handler: CPU0 CPU1 preemption timer vmexit handle_preemption_timer(vCPU0) kvm_lapic_expired_hv_timer hv_timer_is_use == true sched_out sched_in kvm_arch_vcpu_load kvm_lapic_restart_hv_timer restart_apic_timer start_hv_timer already-expired timer or sw timer triggerd in the window start_sw_timer cancel_hv_timer /* back in kvm_lapic_expired_hv_timer */ cancel_hv_timer WARN_ON(!apic->lapic_timer.hv_timer_in_use); ==> Oops This can be reproduced if CONFIG_PREEMPT is enabled. ------------[ cut here ]------------ WARNING: CPU: 4 PID: 2972 at /home/kernel/linux/arch/x86/kvm//lapic.c:1563 kvm_lapic_expired_hv_timer+0x9e/0xb0 [kvm] CPU: 4 PID: 2972 Comm: qemu-system-x86 Tainted: G OE 4.13.0-rc2+ #16 RIP: 0010:kvm_lapic_expired_hv_timer+0x9e/0xb0 [kvm] Call Trace: handle_preemption_timer+0xe/0x20 [kvm_intel] vmx_handle_exit+0xb8/0xd70 [kvm_intel] kvm_arch_vcpu_ioctl_run+0xdd1/0x1be0 [kvm] ? kvm_arch_vcpu_load+0x47/0x230 [kvm] ? kvm_arch_vcpu_load+0x62/0x230 [kvm] kvm_vcpu_ioctl+0x340/0x700 [kvm] ? kvm_vcpu_ioctl+0x340/0x700 [kvm] ? __fget+0xfc/0x210 do_vfs_ioctl+0xa4/0x6a0 ? __fget+0x11d/0x210 SyS_ioctl+0x79/0x90 do_syscall_64+0x81/0x220 entry_SYSCALL64_slow_path+0x25/0x25 ------------[ cut here ]------------ WARNING: CPU: 4 PID: 2972 at /home/kernel/linux/arch/x86/kvm//lapic.c:1498 cancel_hv_timer.isra.40+0x4f/0x60 [kvm] CPU: 4 PID: 2972 Comm: qemu-system-x86 Tainted: G W OE 4.13.0-rc2+ #16 RIP: 0010:cancel_hv_timer.isra.40+0x4f/0x60 [kvm] Call Trace: kvm_lapic_expired_hv_timer+0x3e/0xb0 [kvm] handle_preemption_timer+0xe/0x20 [kvm_intel] vmx_handle_exit+0xb8/0xd70 [kvm_intel] kvm_arch_vcpu_ioctl_run+0xdd1/0x1be0 [kvm] ? kvm_arch_vcpu_load+0x47/0x230 [kvm] ? kvm_arch_vcpu_load+0x62/0x230 [kvm] kvm_vcpu_ioctl+0x340/0x700 [kvm] ? kvm_vcpu_ioctl+0x340/0x700 [kvm] ? __fget+0xfc/0x210 do_vfs_ioctl+0xa4/0x6a0 ? __fget+0x11d/0x210 SyS_ioctl+0x79/0x90 do_syscall_64+0x81/0x220 entry_SYSCALL64_slow_path+0x25/0x25 This patch fixes it by making the caller of cancel_hv_timer, start_hv_timer and start_sw_timer be in preemption-disabled regions, which trivially avoid any reentrancy issue with preempt notifier. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> [Add more WARNs. - Paolo] Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wanpeng Li 提交于
Run kvm-unit-tests/eventinj.flat in L1 w/ ept=0 on both L0 and L1: Before NMI IRET test Sending NMI to self NMI isr running stack 0x461000 Sending nested NMI to self After nested NMI to self Nested NMI isr running rip=40038e After iret After NMI to self FAIL: NMI Commit 4c4a6f79 (KVM: nVMX: track NMI blocking state separately for each VMCS) tracks NMI blocking state separately for vmcs01 and vmcs02. However it is not enough: - The L2 (kvm-unit-tests/eventinj.flat) generates NMI that will fault on IRET, so the L2 can generate #PF which can be intercepted by L0. - L0 walks L1's guest page table and sees the mapping is invalid, it resumes the L1 guest and injects the #PF into L1. At this point the vmcs02 has nmi_known_unmasked=true. - L1 sets set bit 3 (blocking by NMI) in the interruptibility-state field of vmcs12 (and fixes the shadow page table) before resuming L2 guest. - L1 executes VMRESUME to resume L2, causing a vmexit to L0 - during VMRESUME emulation, prepare_vmcs02 sets bit 3 in the interruptibility-state field of vmcs02, but nmi_known_unmasked is still true. - L2 immediately exits to L0 with another page fault, because L0 still has not updated the NGVA->HPA page tables. However, nmi_known_unmasked is true so vmx_recover_nmi_blocking does not do anything. The fix is to update nmi_known_unmasked when preparing vmcs02 from vmcs12. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wincy Van 提交于
The PI vector for L0 and L1 must be different. If dest vcpu0 is in guest mode while vcpu1 is delivering a non-nested PI to vcpu0, there wont't be any vmexit so that the non-nested interrupt will be delayed. Signed-off-by: NWincy Van <fanwenyi0529@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Wincy Van 提交于
We are using the same vector for nested/non-nested posted interrupts delivery, this may cause interrupts latency in L1 since we can't kick the L2 vcpu out of vmx-nonroot mode. This patch introduces a new vector which is only for nested posted interrupts to solve the problems above. Signed-off-by: NWincy Van <fanwenyi0529@gmail.com> Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
由 Paolo Bonzini 提交于
This reverts the change of commit f85c758d, as the behavior it modified was intended. The VM is running in 32-bit PAE mode, and Table 4-7 of the Intel manual says: Table 4-7. Use of CR3 with PAE Paging Bit Position(s) Contents 4:0 Ignored 31:5 Physical address of the 32-Byte aligned page-directory-pointer table used for linear-address translation 63:32 Ignored (these bits exist only on processors supporting the Intel-64 architecture) To placate the static checker, write the mask explicitly as an unsigned long constant instead of using a 32-bit unsigned constant. Cc: Dan Carpenter <dan.carpenter@oracle.com> Fixes: f85c758dSigned-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 25 7月, 2017 2 次提交
-
-
由 Stefan Assmann 提交于
When EFI runtime services are disabled, for example by the "noefi" kernel cmdline parameter, the reboot_type could still be set to BOOT_EFI causing reboot to fail. Fix this by checking if EFI runtime services are enabled. Signed-off-by: NStefan Assmann <sassmann@kpanic.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170724122248.24006-1-sassmann@kpanic.de [ Fixed 'not disabled' double negation. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Michael Davidson 提交于
undef memcpy() and friends in boot/string.c so that the functions defined here will have the correct names, otherwise we end up up trying to redefine __builtin_memcpy() etc. Surprisingly, GCC allows this (and, helpfully, discards the __builtin_ prefix from the function name when compiling it), but clang does not. Adding these #undef's appears to preserve what I assume was the original intent of the code. Signed-off-by: NMichael Davidson <md@google.com> Signed-off-by: NMatthias Kaehlcke <mka@chromium.org> Acked-by: NH. Peter Anvin <hpa@zytor.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bernhard.Rosenkranzer@linaro.org Cc: Greg Hackmann <ghackmann@google.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170724235155.79255-1-mka@chromium.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
- 24 7月, 2017 8 次提交
-
-
由 Masami Hiramatsu 提交于
The following commit: 003002e0 ("kprobes: Fix arch_prepare_kprobe to handle copy insn failures") returns an error if the copying of the instruction, but does not release the allocated insn_slot. Clean up correctly. Signed-off-by: NMasami Hiramatsu <mhiramat@kernel.org> Cc: Ananth N Mavinakayanahalli <ananth@in.ibm.com> Cc: Anil S Keshavamurthy <anil.s.keshavamurthy@intel.com> Cc: David S . Miller <davem@davemloft.net> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 003002e0 ("kprobes: Fix arch_prepare_kprobe to handle copy insn failures") Link: http://lkml.kernel.org/r/150064834183.6172.11694375818447664416.stgit@devboxSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Stephane Eranian 提交于
This skx_uncore_cha_extra_regs array was missing an end-marker. Signed-off-by: NStephane Eranian <eranian@google.com> Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NPeter Zijlstra <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1499967350-10385-7-git-send-email-kan.liang@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Stephane Eranian 提交于
This patch adds two missing event extra regs for Skylake Server CHA PMU: - TOR_INSERTS - TOR_OCCUPANCY Were missing support for all the filters, including opcode matchers. Signed-off-by: NStephane Eranian <eranian@google.com> Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NPeter Zijlstra <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1499967350-10385-6-git-send-email-kan.liang@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Kan Liang 提交于
There is no field c6 and link for CHA BOX FILTER. Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NPeter Zijlstra <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1499967350-10385-5-git-send-email-kan.liang@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Kan Liang 提交于
Correct the umask for LLC_LOOKUP.LOCAL and LLC_LOOKUP.REMOTE events Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NPeter Zijlstra <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1499967350-10385-4-git-send-email-kan.liang@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Kan Liang 提交于
PCU event format for SKX are different from snbep. Introduce a new format group for SKX PCU. Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NPeter Zijlstra <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Stephane Eranian <eranian@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1499967350-10385-3-git-send-email-kan.liang@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Stephane Eranian 提交于
This patch fixes the event_mask and event_ext_mask for the Intel Skylake Server UPI PMU. Bit 21 is not used as a filter. The extended umask is from bit 32 to bit 55. Correct both umasks. Signed-off-by: NStephane Eranian <eranian@google.com> Signed-off-by: NKan Liang <kan.liang@intel.com> Acked-by: NPeter Zijlstra <peterz@infradead.org> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Jiri Olsa <jolsa@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vince Weaver <vincent.weaver@maine.edu> Link: http://lkml.kernel.org/r/1499967350-10385-2-git-send-email-kan.liang@intel.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Paolo Bonzini 提交于
Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com>
-
- 23 7月, 2017 2 次提交
-
-
由 Juergen Gross 提交于
Commit dc6416f1 ("xen/x86: Call cpu_startup_entry(CPUHP_AP_ONLINE_IDLE) from xen_play_dead()") introduced an error leading to a stack overflow of the idle task when a cpu was brought offline/online many times: by calling cpu_startup_entry() instead of returning at the end of xen_play_dead() do_idle() would be entered again and again. Don't use cpu_startup_entry(), but cpuhp_online_idle() instead allowing to return from xen_play_dead(). Cc: <stable@vger.kernel.org> # 4.12 Signed-off-by: NJuergen Gross <jgross@suse.com> Reviewed-by: NBoris Ostrovsky <boris.ostrovsky@oracle.com> Signed-off-by: NJuergen Gross <jgross@suse.com>
-
由 Vitaly Kuznetsov 提交于
CONFIG_BOOTPARAM_HOTPLUG_CPU0 allows to offline CPU0 but Xen HVM guests BUG() in xen_teardown_timer(). Remove the BUG_ON(), this is probably a leftover from ancient times when CPU0 hotplug was impossible, it works just fine for HVM. Signed-off-by: NVitaly Kuznetsov <vkuznets@redhat.com> Acked-by: NJuergen Gross <jgross@suse.com> Signed-off-by: NJuergen Gross <jgross@suse.com>
-
- 21 7月, 2017 4 次提交
-
-
由 Rob Herring 提交于
Now that we have a custom printf format specifier, convert users of full_name to use %pOF instead. This is preparation to remove storing of the full path string for each device node. Signed-off-by: NRob Herring <robh@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: devicetree@vger.kernel.org Link: http://lkml.kernel.org/r/20170718214339.7774-7-robh@kernel.org [ Clarify the error message while at it, as 'node' is ambiguous. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Jiri Olsa 提交于
We have 2 functions using the same sched_task callback: - PEBS drain for free running counters - LBR save/store Both of them are called from intel_pmu_sched_task() and either of them can be unwillingly triggered when the other one is configured to run. Let's say there's PEBS drain configured in sched_task callback for the event, but in the callback itself (intel_pmu_sched_task()) we will also run the code for LBR save/restore, which we did not ask for, but the code in intel_pmu_sched_task() does not check for that. This can lead to extra cycles in some perf monitoring, like when we monitor PEBS event without LBR data. # perf record --no-timestamp -c 10000 -e cycles:p ./perf bench sched pipe -l 1000000 (We need PEBS, non freq/non timestamp event to enable the sched_task callback) The perf stat of cycles and msr:write_msr for above command before the change: ... Performance counter stats for './perf record --no-timestamp -c 10000 -e cycles:p \ ./perf bench sched pipe -l 1000000' (5 runs): 18,519,557,441 cycles:k 91,195,527 msr:write_msr 29.334476406 seconds time elapsed And after the change: ... Performance counter stats for './perf record --no-timestamp -c 10000 -e cycles:p \ ./perf bench sched pipe -l 1000000' (5 runs): 18,704,973,540 cycles:k 27,184,720 msr:write_msr 16.977875900 seconds time elapsed There's no affect on cycles:k because the sched_task happens with events switched off, however the msr:write_msr tracepoint counter together with almost 50% of time speedup show the improvement. Monitoring LBR event and having extra PEBS drain processing in sched_task callback showed just a little speedup, because the drain function does not do much extra work in case there is no PEBS data. Adding conditions to recognize the configured work that needs to be done in the x86_pmu's sched_task callback. Suggested-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NJiri Olsa <jolsa@kernel.org> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Kan Liang <kan.liang@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Link: http://lkml.kernel.org/r/20170719075247.GA27506@kravaSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Andrew Banman 提交于
The BAU confers no benefit to a UV system running with only one hub/socket. Permanently disable the BAU driver if there are less than two hubs online to avoid BAU overhead. We have observed failed boots on single-socket UV4 systems caused by BAU that are avoided with this patch. Also, while at it, consolidate initialization error blocks and fix a memory leak. Signed-off-by: NAndrew Banman <abanman@hpe.com> Acked-by: NRuss Anderson <rja@hpe.com> Acked-by: NMike Travis <mike.travis@hpe.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: tony.ernst@hpe.com Link: http://lkml.kernel.org/r/1500588351-78016-1-git-send-email-abanman@hpe.com [ Minor cleanups. ] Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Linus Torvalds 提交于
They really are, and the "take the address of a single character" makes the string fortification code unhappy (it believes that you can now only acccess one byte, rather than a byte range, and then raises errors for the memory copies going on in there). We could now remove a few 'addressof' operators (since arrays naturally degrade to pointers), but this is the minimal patch that just changes the C prototypes of those template arrays (the templates themselves are defined in inline asm). Reported-by: Nkernel test robot <xiaolong.ye@intel.com> Acked-and-tested-by: NMasami Hiramatsu <mhiramat@kernel.org> Cc: Daniel Micay <danielmicay@gmail.com> Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
-
- 20 7月, 2017 13 次提交
-
-
由 Roman Kagan 提交于
If the SynIC timer message delivery fails due to SINT message slot being busy, there's no point to attempt starting the timer again until we're notified of the slot being released by the guest (via EOM or EOI). Even worse, when a oneshot timer fails to deliver its message, its re-arming with an expiration time in the past leads to immediate retry of the delivery, and so on, without ever letting the guest vcpu to run and release the slot, which results in a livelock. To avoid that, only start the timer when there's no timer message pending delivery. When there is, meaning the slot is busy, the processing will be restarted upon notification from the guest that the slot is released. Signed-off-by: NRoman Kagan <rkagan@virtuozzo.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Wanpeng Li 提交于
This can be reproduced by EPT=1, unrestricted_guest=N, emulate_invalid_state=Y or EPT=0, the trace of kvm-unit-tests/taskswitch2.flat is like below, it tries to emulate invalid guest state task-switch: kvm_exit: reason TASK_SWITCH rip 0x0 info 40000058 0 kvm_emulate_insn: 42000:0:0f 0b (0x2) kvm_emulate_insn: 42000:0:0f 0b (0x2) failed kvm_inj_exception: #UD (0x0) kvm_entry: vcpu 0 kvm_exit: reason TASK_SWITCH rip 0x0 info 40000058 0 kvm_emulate_insn: 42000:0:0f 0b (0x2) kvm_emulate_insn: 42000:0:0f 0b (0x2) failed kvm_inj_exception: #UD (0x0) ...................... It appears that the task-switch emulation updates rflags (and vm86 flag) only after the segments are loaded, causing vmx->emulation_required to be set, when in fact invalid guest state emulation is not needed. This patch fixes it by updating vmx->emulation_required after the rflags (and vm86 flag) is updated in task-switch emulation. Thanks Radim for moving the update to vmx__set_flags and adding Paolo's suggestion for the check. Suggested-by: NNadav Amit <nadav.amit@gmail.com> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Radim Krčmář <rkrcmar@redhat.com> Cc: Nadav Amit <nadav.amit@gmail.com> Signed-off-by: NWanpeng Li <wanpeng.li@hotmail.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Josh Poimboeuf 提交于
Mike Galbraith reported a situation where a WARN_ON_ONCE() call in DRM code turned into an oops. As it turns out, WARN_ON_ONCE() seems to be completely broken when called from a module. The bug was introduced with the following commit: 19d43626 ("debug: Add _ONCE() logic to report_bug()") That commit changed WARN_ON_ONCE() to move its 'once' logic into the bug trap handler. It requires a writable bug table so that the BUGFLAG_DONE bit can be written to the flags to indicate the first warning has occurred. The bug table was made writable for vmlinux, which relies on vmlinux.lds.S and vmlinux.lds.h for laying out the sections. However, it wasn't made writable for modules, which rely on the ELF section header flags. Reported-by: NMike Galbraith <efault@gmx.de> Tested-by: NMasami Hiramatsu <mhiramat@kernel.org> Signed-off-by: NJosh Poimboeuf <jpoimboe@redhat.com> Acked-by: NPeter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Fixes: 19d43626 ("debug: Add _ONCE() logic to report_bug()") Link: http://lkml.kernel.org/r/a53b04235a65478dd9afc51f5b329fdc65c84364.1500095401.git.jpoimboe@redhat.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Arnd Bergmann 提交于
We have space for exactly three characters for the index in "max7315_%d_base", but as GCC points out having more would cause an string overflow: arch/x86/platform/intel-mid/device_libs/platform_max7315.c: In function 'max7315_platform_data': arch/x86/platform/intel-mid/device_libs/platform_max7315.c:41:26: error: '%d' directive writing between 1 and 11 bytes into a region of size 9 [-Werror=format-overflow=] sprintf(base_pin_name, "max7315_%d_base", nr); ^~~~~~~~~~~~~~~~~ arch/x86/platform/intel-mid/device_libs/platform_max7315.c:41:26: note: directive argument in the range [-2147483647, 2147483647] arch/x86/platform/intel-mid/device_libs/platform_max7315.c:41:3: note: 'sprintf' output between 15 and 25 bytes into a destination of size 17 sprintf(base_pin_name, "max7315_%d_base", nr); This makes it use an snprintf() to truncate the string if that happened rather than overflowing the stack. In practice, this is safe, because there won't be a large number of max7315 devices in the systems, and both the format and the length are defined by the firmware interface. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170719125310.2487451-9-arnd@arndb.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Arnd Bergmann 提交于
The IOSF_MBI option requires PCI support, without it we get a harmless Kconfig warning when it gets selected by PUNIT_ATOM_DEBUG: warning: (X86_INTEL_LPSS && SND_SST_IPC_ACPI && MMC_SDHCI_ACPI && PUNIT_ATOM_DEBUG) selects IOSF_MBI which has unmet direct dependencies (PCI) This adds another dependency to avoid the warning. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Cc: Andy Lutomirski <luto@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170719125310.2487451-8-arnd@arndb.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Arnd Bergmann 提交于
Every kernel build on x86 will result in some output: Setup is 13084 bytes (padded to 13312 bytes). System is 4833 kB CRC 6d35fa35 Kernel: arch/x86/boot/bzImage is ready (#2) This shuts it up, so that 'make -s' is truely silent as long as everything works. Building without '-s' should produce unchanged output. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Matt Fleming <matt@codeblueprint.co.uk> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170719125310.2487451-6-arnd@arndb.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Arnd Bergmann 提交于
The x86 version of insb/insw/insl uses an inline assembly that does not have the target buffer listed as an output. This can confuse the compiler, leading it to think that a subsequent access of the buffer is uninitialized: drivers/net/wireless/wl3501_cs.c: In function ‘wl3501_mgmt_scan_confirm’: drivers/net/wireless/wl3501_cs.c:665:9: error: ‘sig.status’ is used uninitialized in this function [-Werror=uninitialized] drivers/net/wireless/wl3501_cs.c:668:12: error: ‘sig.cap_info’ may be used uninitialized in this function [-Werror=maybe-uninitialized] drivers/net/sb1000.c: In function 'sb1000_rx': drivers/net/sb1000.c:775:9: error: 'st[0]' is used uninitialized in this function [-Werror=uninitialized] drivers/net/sb1000.c:776:10: error: 'st[1]' may be used uninitialized in this function [-Werror=maybe-uninitialized] drivers/net/sb1000.c:784:11: error: 'st[1]' may be used uninitialized in this function [-Werror=maybe-uninitialized] I tried to mark the exact input buffer as an output here, but couldn't figure it out. As suggested by Linus, marking all memory as clobbered however is good enough too. For the outs operations, I also add the memory clobber, to force the input to be written to local variables. This is probably already guaranteed by the "asm volatile", but it can't hurt to do this for symmetry. Suggested-by: NLinus Torvalds <torvalds@linux-foundation.org> Signed-off-by: NArnd Bergmann <arnd@arndb.de> Acked-by: NLinus Torvalds <torvalds@linux-foundation.org> Cc: Borislav Petkov <bp@suse.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tom Lendacky <thomas.lendacky@amd.com> Link: http://lkml.kernel.org/r/20170719125310.2487451-5-arnd@arndb.de Link: https://lkml.org/lkml/2017/7/12/605Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Arnd Bergmann 提交于
gcc-7.1.1 produces this warning: arch/x86/math-emu/reg_add_sub.c: In function 'FPU_add': arch/x86/math-emu/reg_add_sub.c:80:48: error: ?: using integer constants in boolean context [-Werror=int-in-bool-context] This appears to be a bug in gcc-7.1.1, and I have reported it as PR81484. The compiler suggests that code written as if (a & b ? c : d) is usually incorrect and should have been if (a & (b ? c : d)) However, in this case, we correctly write if ((a & b) ? c : d) and should not get a warning for it. This adds a dirty workaround for the problem, adding a comparison with zero inside of the macro. The warning is currently disabled in the kernel, so we may decide not to apply the patch, and instead wait for future gcc releases to fix the problem. On the other hand, it seems to be the only instance of this particular problem. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Cc: Bill Metzenthen <billm@melbpc.org.au> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170719125310.2487451-4-arnd@arndb.de Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81484Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Arnd Bergmann 提交于
When building the kernel with "make EXTRA_CFLAGS=...", this overrides the "PARANOID" preprocessor macro defined in arch/x86/math-emu/Makefile, and we run into a build warning: arch/x86/math-emu/reg_compare.c: In function ‘compare_i_st_st’: arch/x86/math-emu/reg_compare.c:254:6: error: ‘f’ may be used uninitialized in this function [-Werror=maybe-uninitialized] This fixes the implementation to work correctly even without the PARANOID flag, and also fixes the Makefile to not use the EXTRA_CFLAGS variable but instead use the ccflags-y variable in the Makefile that is meant for this purpose. Signed-off-by: NArnd Bergmann <arnd@arndb.de> Cc: Bill Metzenthen <billm@melbpc.org.au> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170719125310.2487451-3-arnd@arndb.deSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Arnd Bergmann 提交于
The intialization function checks for various failure scenarios, but unfortunately the compiler gets a little confused about the possible combinations, leading to a false-positive build warning when -Wmaybe-uninitialized is set: arch/x86/events/core.c: In function ‘init_hw_perf_events’: arch/x86/events/core.c:264:3: warning: ‘reg_fail’ may be used uninitialized in this function [-Wmaybe-uninitialized] arch/x86/events/core.c:264:3: warning: ‘val_fail’ may be used uninitialized in this function [-Wmaybe-uninitialized] pr_err(FW_BUG "the BIOS has corrupted hw-PMU resources (MSR %x is %Lx)\n", We can't actually run into this case, so this shuts up the warning by initializing the variables to a known-invalid state. Suggested-by: NPeter Zijlstra <peterz@infradead.org> Signed-off-by: NArnd Bergmann <arnd@arndb.de> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/20170719125310.2487451-2-arnd@arndb.de Link: https://patchwork.kernel.org/patch/9392595/Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Krzysztof Kozlowski 提交于
Remove old, dead Kconfig options (in order appearing in this commit): - EXPERIMENTAL is gone since v3.9; - IP_NF_TARGET_ULOG: commit d4da843e ("netfilter: kill remnants of ulog targets"); - USB_LIBUSUAL: commit f61870ee ("usb: remove libusual"); Signed-off-by: NKrzysztof Kozlowski <krzk@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/1500526885-4341-1-git-send-email-krzk@kernel.orgSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Seunghun Han 提交于
One of the rarely executed code pathes in check_timer() calls unmask_ioapic_irq() passing irq_get_chip_data(0) as argument. That's wrong as unmask_ioapic_irq() expects a pointer to the irq data of interrupt 0. irq_get_chip_data(0) returns NULL, so the following dereference in unmask_ioapic_irq() causes a kernel panic. The issue went unnoticed in the first place because irq_get_chip_data() returns a void pointer so the compiler cannot do a type check on the argument. The code path was added for machines with broken configuration, but it seems that those machines are either not running current kernels or simply do not longer exist. Hand in irq_get_irq_data(0) as argument which provides the correct data. [ tglx: Rewrote changelog ] Fixes: 4467715a ("x86/irq: Move irq_cfg.irq_2_pin into io_apic.c") Signed-off-by: NSeunghun Han <kkamagui@gmail.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1500369644-45767-1-git-send-email-kkamagui@gmail.comSigned-off-by: NIngo Molnar <mingo@kernel.org>
-
由 Seunghun Han 提交于
The bus_irq argument of mp_override_legacy_irq() is used as the index into the isa_irq_to_gsi[] array. The bus_irq argument originates from ACPI_MADT_TYPE_IO_APIC and ACPI_MADT_TYPE_INTERRUPT items in the ACPI tables, but is nowhere sanity checked. That allows broken or malicious ACPI tables to overwrite memory, which might cause malfunction, panic or arbitrary code execution. Add a sanity check and emit a warning when that triggers. [ tglx: Added warning and rewrote changelog ] Signed-off-by: NSeunghun Han <kkamagui@gmail.com> Signed-off-by: NThomas Gleixner <tglx@linutronix.de> Cc: security@kernel.org Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Cc: stable@vger.kernel.org Signed-off-by: NIngo Molnar <mingo@kernel.org>
-
- 19 7月, 2017 3 次提交
-
-
由 Arnd Bergmann 提交于
KVM tries to select 'TASKSTATS', which had additional dependencies: warning: (KVM) selects TASKSTATS which has unmet direct dependencies (NET && MULTIUSER) Signed-off-by: NArnd Bergmann <arnd@arndb.de> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Jim Mattson 提交于
Immediately following MOV-to-SS/POP-to-SS, VM-entry is disallowed. This check comes after the check for a valid VMCS. When this check fails, the instruction pointer should fall through to the next instruction, the ALU flags should be set to indicate VMfailValid, and the VM-instruction error should be set to 26 ("VM entry with events blocked by MOV SS"). Signed-off-by: NJim Mattson <jmattson@google.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-
由 Paolo Bonzini 提交于
vmx_recover_nmi_blocking is using a cached value of the guest interruptibility info, which is stored in vmx->nmi_known_unmasked. vmx_recover_nmi_blocking is run for both normal and nested guests, so the cached value must be per-VMCS. This fixes eventinj.flat in a nested non-EPT environment. With EPT it works, because the EPT violation handler doesn't have the vmx->nmi_known_unmasked optimization (it is unnecessary because, unlike vmx_recover_nmi_blocking, it can just look at the exit qualification). Thanks to Wanpeng Li for debugging the testcase and providing an initial patch. Signed-off-by: NPaolo Bonzini <pbonzini@redhat.com> Signed-off-by: NRadim Krčmář <rkrcmar@redhat.com>
-