1. 02 Aug, 2017 2 commits
    • x86/intel_rdt: Introduce a common compile option for RDT · f01d7d51
      Vikas Shivappa committed
      We currently have CONFIG_RDT_A, which covers the RDT (Resource Director
      Technology) allocation-based resctrl filesystem interface. In
      preparation for adding RDT monitoring support to the same resctrl
      filesystem, change the config option to CONFIG_RDT, which covers both
      the RDT allocation and monitoring code.
      
      No functional change.
      Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: ravi.v.shankar@intel.com
      Cc: tony.luck@intel.com
      Cc: fenghua.yu@intel.com
      Cc: peterz@infradead.org
      Cc: eranian@google.com
      Cc: vikas.shivappa@intel.com
      Cc: ak@linux.intel.com
      Cc: davidcc@google.com
      Cc: reinette.chatre@intel.com
      Link: http://lkml.kernel.org/r/1501017287-28083-4-git-send-email-vikas.shivappa@linux.intel.com
      f01d7d51
    • x86/perf/cqm: Wipe out perf based cqm · c39a0e2c
      Vikas Shivappa committed
      'perf cqm' never worked because the perf infrastructure and the cqm
      hardware support are incompatible. The hardware uses RMIDs to track
      the llc occupancy of tasks, and these RMIDs are per package. This
      makes it difficult to monitor a hierarchy such as a cgroup while also
      monitoring tasks separately, and several patches sent to lkml to fix
      this were NACKed. Furthermore, the following issues in the current
      perf cqm make it almost unusable:

          1. No support to monitor the same group of tasks for which we do
          allocation using resctrl.

          2. It gives random and inaccurate data (mostly 0s) once we run out
          of RMIDs, due to issues in recycling.

          3. Recycling results in inaccurate data because we cannot
          guarantee that the RMID was stolen from a task when it was not
          pulling data into cache, or even when it pulled the least data.
          Also, for monitoring llc_occupancy, if we stop using an RMID_x and
          then start using an RMID_y after we reclaim an RMID from another
          event, we miss accounting for all the occupancy that was tagged to
          RMID_x at a later perf_count.

          4. Recycling code makes the monitoring code, including scheduling,
          complex because the event can lose its RMID at any time. Since MBM
          counters count bandwidth over a period of time by taking snapshots
          of total bytes at two different times, recycling complicates the
          way we count MBM in a hierarchy. We also need a spin lock while we
          do the processing to account for MBM counter overflow, and we
          currently use a spin lock in scheduling to prevent the RMID from
          being taken away.

          5. Lack of support when we run different kinds of events, such as
          task, system-wide and cgroup events, together. Data mostly prints
          0s. This is also because only one RMID can be tied to a cpu, as
          defined by the cqm hardware, but perf can tie multiple events at
          the same time during one sched_in.

          6. No support for monitoring a group of tasks. There is partial
          support for cgroups, but it does not work once there is a hierarchy
          of cgroups or if we want to monitor a task in a cgroup and the
          cgroup itself.

          7. No support for monitoring tasks for their lifetime without perf
          overhead.

          8. It reported the aggregate cache occupancy or memory bandwidth
          over all sockets. But most cloud and VMM based use cases want to
          know the individual per-socket usage.
      Signed-off-by: Vikas Shivappa <vikas.shivappa@linux.intel.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: ravi.v.shankar@intel.com
      Cc: tony.luck@intel.com
      Cc: fenghua.yu@intel.com
      Cc: peterz@infradead.org
      Cc: eranian@google.com
      Cc: vikas.shivappa@intel.com
      Cc: ak@linux.intel.com
      Cc: davidcc@google.com
      Cc: reinette.chatre@intel.com
      Link: http://lkml.kernel.org/r/1501017287-28083-2-git-send-email-vikas.shivappa@linux.intel.com
      c39a0e2c
  2. 28 Jul, 2017 1 commit
    • x86/boot: Disable the address-of-packed-member compiler warning · 20c6c189
      Matthias Kaehlcke committed
      The clang warning 'address-of-packed-member' is disabled for the general
      kernel code; disable it for the x86 boot code as well.
      
      This suppresses a bunch of warnings like this when building with clang:
      
      ./arch/x86/include/asm/processor.h:535:30: warning: taking address of
        packed member 'sp0' of class or structure 'x86_hw_tss' may result in an
        unaligned pointer value [-Waddress-of-packed-member]
          return this_cpu_read_stable(cpu_tss.x86_tss.sp0);
                                      ^~~~~~~~~~~~~~~~~~~
      ./arch/x86/include/asm/percpu.h:391:59: note: expanded from macro
        'this_cpu_read_stable'
          #define this_cpu_read_stable(var)       percpu_stable_op("mov", var)
                                                                          ^~~
      ./arch/x86/include/asm/percpu.h:228:16: note: expanded from macro
        'percpu_stable_op'
          : "p" (&(var)));
                   ^~~
      Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
      Cc: Doug Anderson <dianders@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20170725215053.135586-1-mka@chromium.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      20c6c189
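As an illustration of what the 'address-of-packed-member' warning in commit 20c6c189 is about, here is a minimal userspace sketch. The struct `packed_tss` and both helper names are hypothetical stand-ins, not the kernel's `x86_hw_tss`. The point is that a member of a packed struct may sit at an unaligned offset, so forming a plain pointer to it can be unsafe on some architectures, whereas reading it by value or copying it out with `memcpy()` is not. The commit itself only silences the warning for the boot code (the corresponding compiler flag is `-Wno-address-of-packed-member`); it does not change any accesses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical packed layout: 'sp0' sits at offset 1, so it is misaligned. */
struct packed_tss {
	uint8_t  pad;
	uint64_t sp0;
} __attribute__((packed));

/* Reading the member by value is fine: the compiler knows it is packed. */
static uint64_t read_sp0_by_value(const struct packed_tss *tss)
{
	assert(tss != NULL);
	return tss->sp0;
}

/* Copying the bytes out avoids forming a possibly unaligned uint64_t pointer. */
static uint64_t read_sp0_by_memcpy(const struct packed_tss *tss)
{
	uint64_t val;

	assert(tss != NULL);
	memcpy(&val, (const unsigned char *)tss + offsetof(struct packed_tss, sp0),
	       sizeof(val));
	return val;
}
```

Both helpers return the same value; the difference is only in whether a `uint64_t *` to the packed member is ever created.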
  3. 27 Jul, 2017 5 commits
    • KVM: LAPIC: Fix reentrancy issues with preempt notifiers · 1d518c68
      Wanpeng Li committed
      Preemption can occur in the preemption timer expiration handler:
      
                CPU0                    CPU1
      
        preemption timer vmexit
        handle_preemption_timer(vCPU0)
          kvm_lapic_expired_hv_timer
            hv_timer_in_use == true
        sched_out
                                 sched_in
                                 kvm_arch_vcpu_load
                                   kvm_lapic_restart_hv_timer
                                     restart_apic_timer
                                       start_hv_timer
                                         already-expired timer or sw timer triggered in the window
                                       start_sw_timer
                                         cancel_hv_timer
                                 /* back in kvm_lapic_expired_hv_timer */
                                 cancel_hv_timer
                                   WARN_ON(!apic->lapic_timer.hv_timer_in_use);  ==> Oops
      
      This can be reproduced if CONFIG_PREEMPT is enabled.
      
      ------------[ cut here ]------------
       WARNING: CPU: 4 PID: 2972 at /home/kernel/linux/arch/x86/kvm//lapic.c:1563 kvm_lapic_expired_hv_timer+0x9e/0xb0 [kvm]
       CPU: 4 PID: 2972 Comm: qemu-system-x86 Tainted: G           OE   4.13.0-rc2+ #16
       RIP: 0010:kvm_lapic_expired_hv_timer+0x9e/0xb0 [kvm]
      Call Trace:
        handle_preemption_timer+0xe/0x20 [kvm_intel]
        vmx_handle_exit+0xb8/0xd70 [kvm_intel]
        kvm_arch_vcpu_ioctl_run+0xdd1/0x1be0 [kvm]
        ? kvm_arch_vcpu_load+0x47/0x230 [kvm]
        ? kvm_arch_vcpu_load+0x62/0x230 [kvm]
        kvm_vcpu_ioctl+0x340/0x700 [kvm]
        ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
        ? __fget+0xfc/0x210
        do_vfs_ioctl+0xa4/0x6a0
        ? __fget+0x11d/0x210
        SyS_ioctl+0x79/0x90
        do_syscall_64+0x81/0x220
        entry_SYSCALL64_slow_path+0x25/0x25
       ------------[ cut here ]------------
       WARNING: CPU: 4 PID: 2972 at /home/kernel/linux/arch/x86/kvm//lapic.c:1498 cancel_hv_timer.isra.40+0x4f/0x60 [kvm]
       CPU: 4 PID: 2972 Comm: qemu-system-x86 Tainted: G        W  OE   4.13.0-rc2+ #16
       RIP: 0010:cancel_hv_timer.isra.40+0x4f/0x60 [kvm]
      Call Trace:
        kvm_lapic_expired_hv_timer+0x3e/0xb0 [kvm]
        handle_preemption_timer+0xe/0x20 [kvm_intel]
        vmx_handle_exit+0xb8/0xd70 [kvm_intel]
        kvm_arch_vcpu_ioctl_run+0xdd1/0x1be0 [kvm]
        ? kvm_arch_vcpu_load+0x47/0x230 [kvm]
        ? kvm_arch_vcpu_load+0x62/0x230 [kvm]
        kvm_vcpu_ioctl+0x340/0x700 [kvm]
        ? kvm_vcpu_ioctl+0x340/0x700 [kvm]
        ? __fget+0xfc/0x210
        do_vfs_ioctl+0xa4/0x6a0
        ? __fget+0x11d/0x210
        SyS_ioctl+0x79/0x90
        do_syscall_64+0x81/0x220
        entry_SYSCALL64_slow_path+0x25/0x25
      
      This patch fixes it by making the callers of cancel_hv_timer,
      start_hv_timer and start_sw_timer run in preemption-disabled regions,
      which trivially avoids any reentrancy issue with the preempt notifier.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      [Add more WARNs. - Paolo]
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      1d518c68
    • KVM: nVMX: Fix loss of L2's NMI blocking state · 2d6144e3
      Wanpeng Li committed
      Run kvm-unit-tests/eventinj.flat in L1 w/ ept=0 on both L0 and L1:
      
      Before NMI IRET test
      Sending NMI to self
      NMI isr running stack 0x461000
      Sending nested NMI to self
      After nested NMI to self
      Nested NMI isr running rip=40038e
      After iret
      After NMI to self
      FAIL: NMI
      
      Commit 4c4a6f79 (KVM: nVMX: track NMI blocking state separately
      for each VMCS) tracks NMI blocking state separately for vmcs01 and
      vmcs02. However it is not enough:
      
       - The L2 (kvm-unit-tests/eventinj.flat) generates NMI that will fault
         on IRET, so the L2 can generate #PF which can be intercepted by L0.
       - L0 walks L1's guest page table and sees that the mapping is invalid, so
         it resumes the L1 guest and injects the #PF into L1.  At this point the
         vmcs02 has nmi_known_unmasked=true.
       - L1 sets bit 3 (blocking by NMI) in the interruptibility-state field
         of vmcs12 (and fixes the shadow page table) before resuming L2 guest.
       - L1 executes VMRESUME to resume L2, causing a vmexit to L0.
       - during VMRESUME emulation, prepare_vmcs02 sets bit 3 in the
         interruptibility-state field of vmcs02, but nmi_known_unmasked is
         still true.
       - L2 immediately exits to L0 with another page fault, because L0 still has
         not updated the NGVA->HPA page tables.  However, nmi_known_unmasked is
         true so vmx_recover_nmi_blocking does not do anything.
      
      The fix is to update nmi_known_unmasked when preparing vmcs02 from vmcs12.
      
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      2d6144e3
    • KVM: nVMX: Fix posted intr delivery when vcpu is in guest mode · 06a5524f
      Wincy Van committed
      The PI vector for L0 and L1 must be different. If dest vcpu0
      is in guest mode while vcpu1 is delivering a non-nested PI to
      vcpu0, there won't be any vmexit, so the non-nested interrupt
      will be delayed.
      Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      06a5524f
    • x86: irq: Define a global vector for nested posted interrupts · 210f84b0
      Wincy Van committed
      We are using the same vector for nested and non-nested posted
      interrupt delivery. This may cause interrupt latency in
      L1 since we can't kick the L2 vcpu out of vmx-nonroot mode.

      This patch introduces a new vector which is only for nested
      posted interrupts to solve the problem above.
      Signed-off-by: Wincy Van <fanwenyi0529@gmail.com>
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      210f84b0
    • KVM: x86: do mask out upper bits of PAE CR3 · a512177e
      Paolo Bonzini committed
      This reverts the change of commit f85c758d,
      as the behavior it modified was intended.
      
      The VM is running in 32-bit PAE mode, and Table 4-7 of the Intel manual
      says:
      
      Table 4-7. Use of CR3 with PAE Paging
      Bit Position(s)	Contents
      4:0		Ignored
      31:5		Physical address of the 32-Byte aligned
      		page-directory-pointer table used for linear-address
      		translation
      63:32		Ignored (these bits exist only on processors supporting
      		the Intel-64 architecture)
      
      To placate the static checker, write the mask explicitly as an
      unsigned long constant instead of using a 32-bit unsigned constant.
      
      Cc: Dan Carpenter <dan.carpenter@oracle.com>
      Fixes: f85c758d
      Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
      a512177e
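The masking described in commit a512177e can be sketched in plain C as follows. This is a standalone illustration, not the KVM code: the macro `PAE_CR3_PDPT_MASK` and the function `pae_pdpt_base()` are names made up for this example. Per the quoted Table 4-7, only bits 31:5 of CR3 are meaningful under PAE paging; bits 4:0 and 63:32 are ignored. Writing the mask with a `ul` suffix follows the changelog's remark about using an unsigned long constant.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 31:5 of CR3 hold the PDPT base under PAE paging (Table 4-7). */
#define PAE_CR3_PDPT_MASK 0xffffffe0ul

/* Drops the ignored low bits (4:0) and the ignored upper bits (63:32). */
static uint64_t pae_pdpt_base(uint64_t cr3)
{
	return cr3 & PAE_CR3_PDPT_MASK;
}
```

For example, a CR3 value of 0xdeadbeef12345678 yields a table base of 0x12345660: the upper 32 bits are dropped and the address is rounded down to a 32-byte boundary.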
  4. 25 Jul, 2017 2 commits
  5. 24 Jul, 2017 8 commits
  6. 23 Jul, 2017 2 commits
  7. 21 Jul, 2017 4 commits
    • x86/devicetree: Convert to using %pOF instead of ->full_name · db15e7f2
      Rob Herring committed
      Now that we have a custom printf format specifier, convert users of
      full_name to use %pOF instead. This is preparation to remove storing
      of the full path string for each device node.
      Signed-off-by: Rob Herring <robh@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: devicetree@vger.kernel.org
      Link: http://lkml.kernel.org/r/20170718214339.7774-7-robh@kernel.org
      [ Clarify the error message while at it, as 'node' is ambiguous. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      db15e7f2
    • perf/x86/intel: Add proper condition to run sched_task callbacks · df6c3db8
      Jiri Olsa committed
      We have 2 functions using the same sched_task callback:

        - PEBS drain for free running counters
        - LBR save/restore

      Both of them are called from intel_pmu_sched_task() and
      either of them can be unintentionally triggered when the
      other one is configured to run.
      
      Let's say there's PEBS drain configured in sched_task
      callback for the event, but in the callback itself
      (intel_pmu_sched_task()) we will also run the code for
      LBR save/restore, which we did not ask for, but the
      code in intel_pmu_sched_task() does not check for that.
      
      This can lead to extra cycles in some perf monitoring,
      like when we monitor PEBS event without LBR data.
      
        # perf record --no-timestamp -c 10000 -e cycles:p ./perf bench sched pipe -l 1000000
      
        (We need PEBS, non freq/non timestamp event to enable
         the sched_task callback)
      
      The perf stat of cycles and msr:write_msr for above
      command before the change:
        ...
        Performance counter stats for './perf record --no-timestamp -c 10000 -e cycles:p \
                                       ./perf bench sched pipe -l 1000000' (5 runs):
      
          18,519,557,441      cycles:k
              91,195,527      msr:write_msr
      
            29.334476406 seconds time elapsed
      
      And after the change:
        ...
        Performance counter stats for './perf record --no-timestamp -c 10000 -e cycles:p \
                                       ./perf bench sched pipe -l 1000000' (5 runs):
      
          18,704,973,540      cycles:k
              27,184,720      msr:write_msr
      
            16.977875900 seconds time elapsed
      
      There's no effect on cycles:k because the sched_task happens
      with events switched off; however, the msr:write_msr tracepoint
      counter, together with the almost 50% speedup in elapsed time,
      shows the improvement.
      
      Monitoring LBR event and having extra PEBS drain processing
      in sched_task callback showed just a little speedup, because
      the drain function does not do much extra work in case there
      is no PEBS data.
      
      Add conditions to recognize which configured work needs
      to be done in the x86_pmu's sched_task callback.
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Jiri Olsa <jolsa@kernel.org>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Kan Liang <kan.liang@intel.com>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Jiri Olsa <jolsa@kernel.org>
      Link: http://lkml.kernel.org/r/20170719075247.GA27506@krava
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      df6c3db8
    • x86/platform/uv/BAU: Disable BAU on single hub configurations · 2fe9a5c6
      Andrew Banman committed
      The BAU confers no benefit to a UV system running with only one hub/socket.
      Permanently disable the BAU driver if there are fewer than two hubs online
      to avoid BAU overhead. We have observed failed boots on single-socket UV4
      systems caused by BAU that are avoided with this patch.
      
      Also, while at it, consolidate initialization error blocks and fix a
      memory leak.
      Signed-off-by: Andrew Banman <abanman@hpe.com>
      Acked-by: Russ Anderson <rja@hpe.com>
      Acked-by: Mike Travis <mike.travis@hpe.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: tony.ernst@hpe.com
      Link: http://lkml.kernel.org/r/1500588351-78016-1-git-send-email-abanman@hpe.com
      [ Minor cleanups. ]
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      2fe9a5c6
    • x86: mark kprobe templates as character arrays, not single characters · 54a7d50b
      Linus Torvalds committed
      They really are, and the "take the address of a single character" makes
      the string fortification code unhappy (it believes that you can now only
      access one byte, rather than a byte range, and then raises errors for
      the memory copies going on in there).

      We could now remove a few 'addressof' operators (since arrays naturally
      decay to pointers), but this is the minimal patch that just changes
      the C prototypes of those template arrays (the templates themselves are
      defined in inline asm).
      Reported-by: kernel test robot <xiaolong.ye@intel.com>
      Acked-and-tested-by: Masami Hiramatsu <mhiramat@kernel.org>
      Cc: Daniel Micay <danielmicay@gmail.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      54a7d50b
  8. 20 Jul, 2017 13 commits
    • kvm: x86: hyperv: avoid livelock in oneshot SynIC timers · f1ff89ec
      Roman Kagan committed
      If the SynIC timer message delivery fails due to the SINT message slot
      being busy, there's no point in attempting to start the timer again until
      we're notified of the slot being released by the guest (via EOM or EOI).

      Even worse, when a oneshot timer fails to deliver its message, its
      re-arming with an expiration time in the past leads to immediate retry
      of the delivery, and so on, without ever letting the guest vcpu run
      and release the slot, which results in a livelock.
      
      To avoid that, only start the timer when there's no timer message
      pending delivery.  When there is, meaning the slot is busy, the
      processing will be restarted upon notification from the guest that the
      slot is released.
      Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      f1ff89ec
    • KVM: VMX: Fix invalid guest state detection after task-switch emulation · f244deed
      Wanpeng Li committed
      This can be reproduced with EPT=1, unrestricted_guest=N,
      emulate_invalid_state=Y, or with EPT=0. The trace of
      kvm-unit-tests/taskswitch2.flat looks like the one below; it tries to
      emulate an invalid guest state task-switch:
      
      kvm_exit: reason TASK_SWITCH rip 0x0 info 40000058 0
      kvm_emulate_insn: 42000:0:0f 0b (0x2)
      kvm_emulate_insn: 42000:0:0f 0b (0x2) failed
      kvm_inj_exception: #UD (0x0)
      kvm_entry: vcpu 0
      kvm_exit: reason TASK_SWITCH rip 0x0 info 40000058 0
      kvm_emulate_insn: 42000:0:0f 0b (0x2)
      kvm_emulate_insn: 42000:0:0f 0b (0x2) failed
      kvm_inj_exception: #UD (0x0)
      ......................
      
      It appears that the task-switch emulation updates rflags (and vm86
      flag) only after the segments are loaded, causing vmx->emulation_required
      to be set, when in fact invalid guest state emulation is not needed.
      
      This patch fixes it by updating vmx->emulation_required after the
      rflags (and vm86 flag) is updated in task-switch emulation.
      
      Thanks Radim for moving the update to vmx_set_rflags and adding Paolo's
      suggestion for the check.
      Suggested-by: Nadav Amit <nadav.amit@gmail.com>
      Cc: Paolo Bonzini <pbonzini@redhat.com>
      Cc: Radim Krčmář <rkrcmar@redhat.com>
      Cc: Nadav Amit <nadav.amit@gmail.com>
      Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
      Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
      f244deed
    • debug: Fix WARN_ON_ONCE() for modules · 325cdacd
      Josh Poimboeuf committed
      Mike Galbraith reported a situation where a WARN_ON_ONCE() call in DRM
      code turned into an oops.  As it turns out, WARN_ON_ONCE() seems to be
      completely broken when called from a module.
      
      The bug was introduced with the following commit:
      
        19d43626 ("debug: Add _ONCE() logic to report_bug()")
      
      That commit changed WARN_ON_ONCE() to move its 'once' logic into the bug
      trap handler.  It requires a writable bug table so that the BUGFLAG_DONE
      bit can be written to the flags to indicate the first warning has
      occurred.
      
      The bug table was made writable for vmlinux, which relies on
      vmlinux.lds.S and vmlinux.lds.h for laying out the sections.  However,
      it wasn't made writable for modules, which rely on the ELF section
      header flags.
      Reported-by: Mike Galbraith <efault@gmx.de>
      Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
      Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
      Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Fixes: 19d43626 ("debug: Add _ONCE() logic to report_bug()")
      Link: http://lkml.kernel.org/r/a53b04235a65478dd9afc51f5b329fdc65c84364.1500095401.git.jpoimboe@redhat.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      325cdacd
    • x86/platform/intel-mid: Fix a format string overflow warning · 0bc73048
      Arnd Bergmann committed
      We have space for exactly three characters for the index in "max7315_%d_base",
      but as GCC points out, having more would cause a string overflow:
      
        arch/x86/platform/intel-mid/device_libs/platform_max7315.c: In function 'max7315_platform_data':
        arch/x86/platform/intel-mid/device_libs/platform_max7315.c:41:26: error: '%d' directive writing between 1 and 11 bytes into a region of size 9 [-Werror=format-overflow=]
           sprintf(base_pin_name, "max7315_%d_base", nr);
                                ^~~~~~~~~~~~~~~~~
        arch/x86/platform/intel-mid/device_libs/platform_max7315.c:41:26: note: directive argument in the range [-2147483647, 2147483647]
        arch/x86/platform/intel-mid/device_libs/platform_max7315.c:41:3: note: 'sprintf' output between 15 and 25 bytes into a destination of size 17
           sprintf(base_pin_name, "max7315_%d_base", nr);
      
      This makes it use snprintf() to truncate the string if that happens,
      rather than overflowing the stack. In practice, this is safe, because
      there won't be a large number of max7315 devices in the systems, and
      both the format and the length are defined by the firmware interface.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20170719125310.2487451-9-arnd@arndb.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      0bc73048
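The truncation behaviour that commit 0bc73048 relies on can be sketched in userspace as follows. The helper `format_base_pin_name()` and the constant `BASE_PIN_NAME_LEN` are names invented for this sketch; only the format string and the 17-byte destination size come from the GCC output quoted above. With 17 bytes, "max7315_" (8) plus "_base" (5) plus the terminating NUL (1) leave exactly three characters for the index.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define BASE_PIN_NAME_LEN 17	/* destination size quoted in the GCC note */

/*
 * Formats the pin name into buf, truncating instead of overflowing.
 * Returns what snprintf() returns: the length the untruncated string
 * would have had, not counting the terminating NUL.
 */
static int format_base_pin_name(char *buf, size_t len, int nr)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "max7315_%d_base", nr);
}
```

A caller can detect truncation by comparing the return value against the buffer size: a result greater than or equal to `len` means the name did not fit, which happens here as soon as the index needs four characters.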
    • x86/platform: Add PCI dependency for PUNIT_ATOM_DEBUG · d689c64d
      Arnd Bergmann committed
      The IOSF_MBI option requires PCI support; without it we get a harmless
      Kconfig warning when it gets selected by PUNIT_ATOM_DEBUG:
      
        warning: (X86_INTEL_LPSS && SND_SST_IPC_ACPI && MMC_SDHCI_ACPI && PUNIT_ATOM_DEBUG) selects IOSF_MBI which has unmet direct dependencies (PCI)
      
      This adds another dependency to avoid the warning.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Kees Cook <keescook@chromium.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20170719125310.2487451-8-arnd@arndb.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d689c64d
    • x86/build: Silence the build with "make -s" · d460131d
      Arnd Bergmann committed
      Every kernel build on x86 will result in some output:
      
        Setup is 13084 bytes (padded to 13312 bytes).
        System is 4833 kB
        CRC 6d35fa35
        Kernel: arch/x86/boot/bzImage is ready  (#2)
      
      This shuts it up, so that 'make -s' is truly silent as long as
      everything works. Building without '-s' should produce unchanged
      output.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Matt Fleming <matt@codeblueprint.co.uk>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20170719125310.2487451-6-arnd@arndb.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      d460131d
    • x86/io: Add "memory" clobber to insb/insw/insl/outsb/outsw/outsl · 7206f9bf
      Arnd Bergmann committed
      The x86 version of insb/insw/insl uses an inline assembly that does
      not have the target buffer listed as an output. This can confuse
      the compiler, leading it to think that a subsequent access of the
      buffer is uninitialized:
      
        drivers/net/wireless/wl3501_cs.c: In function ‘wl3501_mgmt_scan_confirm’:
        drivers/net/wireless/wl3501_cs.c:665:9: error: ‘sig.status’ is used uninitialized in this function [-Werror=uninitialized]
        drivers/net/wireless/wl3501_cs.c:668:12: error: ‘sig.cap_info’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
        drivers/net/sb1000.c: In function 'sb1000_rx':
        drivers/net/sb1000.c:775:9: error: 'st[0]' is used uninitialized in this function [-Werror=uninitialized]
        drivers/net/sb1000.c:776:10: error: 'st[1]' may be used uninitialized in this function [-Werror=maybe-uninitialized]
        drivers/net/sb1000.c:784:11: error: 'st[1]' may be used uninitialized in this function [-Werror=maybe-uninitialized]
      
      I tried to mark the exact input buffer as an output here, but couldn't
      figure it out. As suggested by Linus, marking all memory as clobbered
      however is good enough too. For the outs operations, I also add the
      memory clobber, to force the input to be written to local variables.
      This is probably already guaranteed by the "asm volatile", but it can't
      hurt to do this for symmetry.
      Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Borislav Petkov <bp@suse.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: Tom Lendacky <thomas.lendacky@amd.com>
      Link: http://lkml.kernel.org/r/20170719125310.2487451-5-arnd@arndb.de
      Link: https://lkml.org/lkml/2017/7/12/605
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      7206f9bf
    • x86/fpu/math-emu: Avoid bogus -Wint-in-bool-context warning · 5623452a
      Arnd Bergmann committed
      gcc-7.1.1 produces this warning:
      
        arch/x86/math-emu/reg_add_sub.c: In function 'FPU_add':
        arch/x86/math-emu/reg_add_sub.c:80:48: error: ?: using integer constants in boolean context [-Werror=int-in-bool-context]
      
      This appears to be a bug in gcc-7.1.1, and I have reported it as
      PR81484. The compiler suggests that code written as
      
      	if (a & b ? c : d)
      
      is usually incorrect and should have been
      
      	if (a & (b ? c : d))
      
      However, in this case, we correctly write
      
      	if ((a & b) ? c : d)
      
      and should not get a warning for it.
      
      This adds a dirty workaround for the problem, adding a comparison with
      zero inside of the macro. The warning is currently disabled in the kernel,
      so we may decide not to apply the patch, and instead wait for future gcc
      releases to fix the problem. On the other hand, it seems to be the
      only instance of this particular problem.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Bill Metzenthen <billm@melbpc.org.au>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20170719125310.2487451-4-arnd@arndb.de
      Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81484
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      5623452a
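The three spellings discussed in commit 5623452a differ only in grouping, which a short sketch can make concrete. The function names below are invented for illustration, and the operands are plain variables rather than the integer constants used in the math-emu macro. In C, `&` binds more tightly than `?:`, so the unparenthesized form groups the same way as the explicitly parenthesized one, while `a & (b ? c : d)` is a different expression.

```c
#include <assert.h>

/* Unparenthesized: '&' has higher precedence than '?:'. */
static int implicit_form(int a, int b, int c, int d)
{
	return a & b ? c : d;
}

/* What the kernel code writes, and how the form above is parsed. */
static int explicit_form(int a, int b, int c, int d)
{
	return (a & b) ? c : d;
}

/* The alternative that the compiler warning suggests might be intended. */
static int other_form(int a, int b, int c, int d)
{
	return a & (b ? c : d);
}
```

With a = 4, b = 1, c = 4, d = 2, the first two functions return 2 (because 4 & 1 is zero, so d is selected), while the third returns 4 (because b is nonzero, so the result is 4 & 4).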
    • x86/fpu/math-emu: Fix possible uninitialized variable use · 75e2f0a6
      Arnd Bergmann committed
      When building the kernel with "make EXTRA_CFLAGS=...", this overrides
      the "PARANOID" preprocessor macro defined in arch/x86/math-emu/Makefile,
      and we run into a build warning:
      
        arch/x86/math-emu/reg_compare.c: In function ‘compare_i_st_st’:
        arch/x86/math-emu/reg_compare.c:254:6: error: ‘f’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
      
      This fixes the implementation to work correctly even without the PARANOID
      flag, and also fixes the Makefile to not use the EXTRA_CFLAGS variable
      but instead use the ccflags-y variable in the Makefile that is meant
      for this purpose.
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Bill Metzenthen <billm@melbpc.org.au>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20170719125310.2487451-3-arnd@arndb.de
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      75e2f0a6
    • perf/x86: Shut up false-positive -Wmaybe-uninitialized warning · 11d8b058
      Arnd Bergmann committed
      The initialization function checks for various failure scenarios, but
      unfortunately the compiler gets a little confused about the possible
      combinations, leading to a false-positive build warning when
      -Wmaybe-uninitialized is set:
      
        arch/x86/events/core.c: In function ‘init_hw_perf_events’:
        arch/x86/events/core.c:264:3: warning: ‘reg_fail’ may be used uninitialized in this function [-Wmaybe-uninitialized]
        arch/x86/events/core.c:264:3: warning: ‘val_fail’ may be used uninitialized in this function [-Wmaybe-uninitialized]
           pr_err(FW_BUG "the BIOS has corrupted hw-PMU resources (MSR %x is %Lx)\n",
      
      We can't actually run into this case, so this shuts up the warning
      by initializing the variables to a known-invalid state.
      Suggested-by: Peter Zijlstra <peterz@infradead.org>
      Signed-off-by: Arnd Bergmann <arnd@arndb.de>
      Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
      Cc: Andy Lutomirski <luto@kernel.org>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Josh Poimboeuf <jpoimboe@redhat.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/20170719125310.2487451-2-arnd@arndb.de
      Link: https://patchwork.kernel.org/patch/9392595/
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      11d8b058
    • x86/defconfig: Remove stale, old Kconfig options · 0e7f0b6c
      Krzysztof Kozlowski committed
      Remove old, dead Kconfig options (in order of appearance in this commit):
      
       - EXPERIMENTAL is gone since v3.9;
       - IP_NF_TARGET_ULOG: commit d4da843e ("netfilter: kill remnants of ulog targets");
       - USB_LIBUSUAL: commit f61870ee ("usb: remove libusual");
      Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Link: http://lkml.kernel.org/r/1500526885-4341-1-git-send-email-krzk@kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      0e7f0b6c
    • x86/ioapic: Pass the correct data to unmask_ioapic_irq() · e708e35b
      Seunghun Han committed
      One of the rarely executed code paths in check_timer() calls
      unmask_ioapic_irq() passing irq_get_chip_data(0) as argument.
      
      That's wrong as unmask_ioapic_irq() expects a pointer to the irq data of
      interrupt 0. irq_get_chip_data(0) returns NULL, so the following
      dereference in unmask_ioapic_irq() causes a kernel panic.
      
      The issue went unnoticed in the first place because irq_get_chip_data()
      returns a void pointer so the compiler cannot do a type check on the
      argument. The code path was added for machines with broken configuration,
      but it seems that those machines are either not running current kernels or
      simply no longer exist.
      
      Hand in irq_get_irq_data(0) as argument which provides the correct data.
      
      [ tglx: Rewrote changelog ]
      
      Fixes: 4467715a ("x86/irq: Move irq_cfg.irq_2_pin into io_apic.c")
      Signed-off-by: Seunghun Han <kkamagui@gmail.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: stable@vger.kernel.org
      Link: http://lkml.kernel.org/r/1500369644-45767-1-git-send-email-kkamagui@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      e708e35b
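Why the compiler could not catch the bug fixed in commit e708e35b can be shown with a small standalone sketch. Everything here is hypothetical and heavily simplified: `struct irq_data_sketch`, `get_chip_data()`, `get_irq_data()` and `unmask_sketch()` merely mimic the shape of the kernel functions named in the changelog. Because the chip-data accessor returns `void *`, C converts its result to any object pointer type without a diagnostic, so passing it where the irq data pointer is expected compiles cleanly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for the kernel's irq data. */
struct irq_data_sketch {
	unsigned int irq;
	void *chip_data;	/* NULL here, as the changelog says for IRQ 0 */
};

static struct irq_data_sketch irq0 = { .irq = 0, .chip_data = NULL };

/* Returns void *, so it is accepted wherever any object pointer is expected. */
static void *get_chip_data(unsigned int irq)
{
	return irq == 0 ? irq0.chip_data : NULL;
}

/* Returns the typed pointer that the consumer actually wants. */
static struct irq_data_sketch *get_irq_data(unsigned int irq)
{
	return irq == 0 ? &irq0 : NULL;
}

/* Returns the irq number, or -1 where the kernel code would dereference NULL. */
static int unmask_sketch(struct irq_data_sketch *data)
{
	if (data == NULL)
		return -1;
	return (int)data->irq;
}
```

Both `unmask_sketch(get_chip_data(0))` and `unmask_sketch(get_irq_data(0))` compile without a warning, but only the second one hands over a valid object.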
    • x86/acpi: Prevent out of bound access caused by broken ACPI tables · dad5ab0d
      Seunghun Han committed
      The bus_irq argument of mp_override_legacy_irq() is used as the index into
      the isa_irq_to_gsi[] array. The bus_irq argument originates from
      ACPI_MADT_TYPE_IO_APIC and ACPI_MADT_TYPE_INTERRUPT items in the ACPI
      tables, but is nowhere sanity checked.
      
      That allows broken or malicious ACPI tables to overwrite memory, which
      might cause malfunction, panic or arbitrary code execution.
      
      Add a sanity check and emit a warning when that triggers.
      
      [ tglx: Added warning and rewrote changelog ]
      Signed-off-by: Seunghun Han <kkamagui@gmail.com>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: security@kernel.org
      Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Cc: stable@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
      dad5ab0d
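The kind of sanity check that commit dad5ab0d describes can be sketched as follows. This is not the kernel patch: `LEGACY_IRQS`, `override_legacy_irq()` and the warning text are assumptions made for illustration, and the table size of 16 is assumed as well. Only the array name `isa_irq_to_gsi[]` and the idea of validating `bus_irq` before using it as an index come from the changelog.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define LEGACY_IRQS 16	/* assumed size of the ISA IRQ table */

static uint32_t isa_irq_to_gsi[LEGACY_IRQS];

/*
 * Records an override for a legacy IRQ.
 * Returns 0 on success, -1 if the firmware-supplied index is out of range.
 */
static int override_legacy_irq(uint8_t bus_irq, uint32_t gsi)
{
	if (bus_irq >= LEGACY_IRQS) {
		fprintf(stderr, "Invalid bus_irq %u for legacy override\n",
			(unsigned int)bus_irq);
		return -1;
	}

	isa_irq_to_gsi[bus_irq] = gsi;
	return 0;
}
```

Since the index comes straight from firmware tables, the check has to happen before the array write; an index such as 200 is rejected with a warning instead of overwriting unrelated memory.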
  9. 19 Jul, 2017 3 commits