1. 17 Jan, 2014 1 commit
  2. 16 Jan, 2014 4 commits
    • R
      perf/x86/amd/ibs: Fix waking up from S3 for AMD family 10h · bee09ed9
      Robert Richter committed
      On AMD family 10h we see the following error messages while waking up from
      S3 for all non-boot CPUs, leading to a failed IBS initialization:
      
       Enabling non-boot CPUs ...
       smpboot: Booting Node 0 Processor 1 APIC 0x1
       [Firmware Bug]: cpu 1, try to use APIC500 (LVT offset 0) for vector 0x400, but the register is already in use for vector 0xf9 on another cpu
       perf: IBS APIC setup failed on cpu #1
       process: Switch to broadcast mode on CPU1
       CPU1 is up
       ...
       ACPI: Waking up from system sleep state S3
      
      The reason for this is that during suspend the LVT offset for the IBS
      vector gets lost and needs to be reinitialized while resuming.
      
      The offset is read from the IBSCTL MSR. On family 10h the offset needs
      to be 1 as offset 0 is used for the MCE threshold interrupt, but
      firmware assigns offset 0 to IBS as well. The kernel needs to reprogram
      the vector. The MSR is a read-only node MSR, but a new value can be
      written via PCI config space access. The reinitialization is
      implemented for family 10h in setup_ibs_ctl() which is forced during
      IBS setup.
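
As a rough illustration of the offset check described above, the sketch below decodes an LVT offset from a raw IBSCTL value. It assumes the bit layout behind the kernel's IBSCTL_LVT_OFFSET_VALID (bit 8) and IBSCTL_LVT_OFFSET_MASK (low four bits) definitions. The helper names and the standalone user-space form are hypothetical and are not taken from the patch.

```c
#include <stdint.h>

/* Assumed layout: bit 8 marks the offset as valid, bits 3:0 hold the offset. */
#define IBSCTL_LVT_OFFSET_VALID (1ULL << 8)
#define IBSCTL_LVT_OFFSET_MASK  0x0FULL

/* Hypothetical helper: return the LVT offset, or -1 if the valid bit is clear. */
static inline int ibs_lvt_offset(uint64_t ibsctl)
{
	if (!(ibsctl & IBSCTL_LVT_OFFSET_VALID))
		return -1;
	return (int)(ibsctl & IBSCTL_LVT_OFFSET_MASK);
}

/*
 * Hypothetical helper: on family 10h, offset 0 is taken by the MCE
 * threshold interrupt, so anything other than a valid offset 1 would
 * need to be reprogrammed.
 */
static inline int ibs_offset_needs_fixup_fam10h(uint64_t ibsctl)
{
	return ibs_lvt_offset(ibsctl) != 1;
}
```

With a firmware value of 0x100 (valid bit set, offset 0) the second helper reports that a fixup is needed, which matches the situation the commit message describes.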
      
      This patch fixes IBS setup after waking up from S3 by adding
      resume/suspend hooks for the boot CPU which do the offset
      reinitialization.
      
      Marking it as stable to let distros pick up this fix.
      Signed-off-by: Robert Richter <rric@kernel.org>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: <stable@vger.kernel.org> v3.2..
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Link: http://lkml.kernel.org/r/1389797849-5565-1-git-send-email-rric.net@gmail.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • B
      x86, tsc: Add static (MSR) TSC calibration on Intel Atom SoCs · 7da7c156
      Bin Gao committed
      On SoCs that have the calibration MSRs available, either there is no
      PIT, HPET or PMTIMER to calibrate against, or the PIT/HPET/PMTIMER is
      driven from the same clock as the TSC, so calibration is redundant and
      just slows down the boot.
      
      TSC rate is calculated by this formula:
      <maximum core-clock to bus-clock ratio> * <maximum resolved frequency>
      The ratio and the resolved frequency ID can be obtained from MSR.
      See the Intel 64 and IA-32 System Programming Guide, sections 16.12 and 30.11.5,
      for details.
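
The formula above can be sketched in a few lines of C. This is only an illustration: the frequency table, its values, and the function name are assumptions made for the example, and the real frequency IDs and their meanings are SoC specific.

```c
#include <stdint.h>

/*
 * Illustrative table only: maps a resolved frequency ID to a bus
 * frequency in kHz. Real IDs and values depend on the SoC.
 */
static const uint32_t example_freq_khz[] = { 83300, 100000, 133300, 116700 };

/*
 * TSC rate in kHz = <core-clock to bus-clock ratio> * <resolved frequency>.
 * Returns 0 when the frequency ID is not in the example table.
 */
static inline uint64_t tsc_khz(uint32_t max_ratio, uint32_t freq_id)
{
	if (freq_id >= sizeof(example_freq_khz) / sizeof(example_freq_khz[0]))
		return 0;
	return (uint64_t)max_ratio * example_freq_khz[freq_id];
}
```

For example, a ratio of 16 with the 100000 kHz entry gives 1600000 kHz, i.e. a 1.6 GHz TSC, without measuring against any other timer.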
      Signed-off-by: Bin Gao <bin.gao@intel.com>
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Link: http://lkml.kernel.org/n/tip-rgm7xmg7k6qnjlw3ynkcjsmh@git.kernel.org
    • H
      x86, apic: Make disabled_cpu_apicid static read_mostly, fix typos · 5b4d1dbc
      H. Peter Anvin committed
      Make disabled_cpu_apicid static and read_mostly, and fix a couple of
      typos.
      Reported-by: Ingo Molnar <mingo@kernel.org>
      Link: http://lkml.kernel.org/r/20140115182511.GA22737@gmail.com
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
    • H
      x86, apic, kexec: Add disable_cpu_apicid kernel parameter · 151e0c7d
      HATAYAMA Daisuke committed
      Add disable_cpu_apicid kernel parameter. To use this kernel parameter,
      specify an initial APIC ID of the corresponding CPU you want to
      disable.
      
      This is mostly used by the kdump 2nd kernel to disable the BSP, so that
      multiple CPUs can be woken up without causing a system reset or hang
      due to an INIT sent from an AP to the BSP.
      
      Kdump users first figure out the initial APIC ID of the BSP (CPU0 in the
      1st kernel), for example from /proc/cpuinfo, and then set up this kernel
      parameter for the 2nd kernel using the obtained APIC ID.
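
A minimal user-space sketch of that lookup is shown below. It assumes the cpuinfo text contains an "initial apicid" line per processor and that the first such line belongs to CPU0; the function name is hypothetical and the code is not taken from kexec-tools.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: scan cpuinfo-style text and return the value of
 * the first "initial apicid" line, or -1 if none is found. The first
 * processor entry is assumed to be CPU0.
 */
int first_initial_apicid(FILE *f)
{
	char line[256];
	int id;

	while (fgets(line, sizeof(line), f)) {
		if (strncmp(line, "initial apicid", 14) == 0) {
			const char *colon = strchr(line, ':');

			if (colon && sscanf(colon + 1, "%d", &id) == 1)
				return id;
		}
	}
	return -1;
}
```

A caller would open /proc/cpuinfo in the 1st kernel, pass the stream to this function, and append disable_cpu_apicid=<value> to the 2nd kernel's command line.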
      
      However, doing this procedure manually at each boot is awkward; it
      should be done automatically by user-land service scripts, for
      example kexec-tools on Fedora/RHEL distributions.
      
      This design is more flexible than disabling the BSP automatically at
      kernel boot time, because at boot time the kernel has no choice but to
      refer to the ACPI/MP tables to obtain the initial APIC ID of the BSP,
      meaning that method is not applicable to systems without such BIOS
      tables.
      
      One assumption behind this design is that users get the initial APIC ID
      of the BSP while the system is still in a healthy state, so the BSP is
      uniquely kept in CPU0. Thus, only one initial APIC ID can be specified
      through the kernel parameter.
      
      In the comparison with disabled_cpu_apicid, we use read_apic_id(), not
      boot_cpu_physical_apicid, because on some platforms the variable is
      modified to the apicid reported as BSP through the MP table and this
      function is executed with the temporarily modified
      boot_cpu_physical_apicid. As a result, the disable_cpu_apicid kernel
      parameter doesn't work well for apicids of APs.
      
      Fixing the wrong handling of boot_cpu_physical_apicid requires review
      and testing across several platforms, and it could take some
      time. The fix here is a workaround that keeps the focus on the main
      topic of this patch.
      Signed-off-by: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
      Link: http://lkml.kernel.org/r/20140115064458.1545.38775.stgit@localhost6.localdomain6
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  3. 15 Jan, 2014 1 commit
  4. 14 Jan, 2014 5 commits
  5. 13 Jan, 2014 5 commits
  6. 12 Jan, 2014 3 commits
  7. 10 Jan, 2014 1 commit
  8. 09 Jan, 2014 1 commit
    • D
      arch: x86: New MailBox support driver for Intel SOC's · 46184415
      David E. Box committed
      Current Intel SOC cores use a MailBox Interface (MBI) to provide access to
      configuration registers on devices (called units) connected to the system
      fabric. This is a support driver that implements access to this interface on
      those platforms that can enumerate the device using PCI. Initial support is for
      BayTrail, for which port definitions are provided. This is a requirement for
      implementing platform-specific features (e.g. the RAPL driver requires this to
      perform platform-specific power management using the registers in PUNIT).
      Dependent modules should select IOSF_MBI in their respective Kconfig
      configuration. Serialized access is handled by all exported routines with
      spinlocks.
      
      The API includes 3 functions for access to unit registers:
      
      int iosf_mbi_read(u8 port, u8 opcode, u32 offset, u32 *mdr)
      int iosf_mbi_write(u8 port, u8 opcode, u32 offset, u32 mdr)
      int iosf_mbi_modify(u8 port, u8 opcode, u32 offset, u32 mdr, u32 mask)
      
      port:	indicating the unit being accessed
      opcode:	the read or write port specific opcode
      offset:	the register offset within the port
      mdr:	the register data to be read, written, or modified
      mask:	bit locations in mdr to change
      
      Returns nonzero on error
      
      Note: GPU code handles access to the GFX unit. Therefore access to that unit
      with this driver is disallowed to avoid conflicts.
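
To make the role of the mask argument more concrete, here is a small stand-alone sketch of a masked read-modify-write. It is one plausible reading of "bit locations in mdr to change", written as an assumption for illustration; the helper name is hypothetical and the code is not the driver's implementation.

```c
#include <stdint.h>

/*
 * Hypothetical helper: compute the value written back by a masked
 * modify. Bits set in mask are taken from mdr, all other bits keep
 * the value previously read from the register.
 */
static inline uint32_t mbi_apply_mask(uint32_t old, uint32_t mdr, uint32_t mask)
{
	return (old & ~mask) | (mdr & mask);
}
```

For example, applying mdr 0xAB with mask 0xFF to a register holding 0xFFFF0000 yields 0xFFFF00AB, leaving the upper bits untouched.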
      Signed-off-by: David E. Box <david.e.box@linux.intel.com>
      Link: http://lkml.kernel.org/r/1389216471-734-1-git-send-email-david.e.box@linux.intel.com
      Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
      Cc: Rafael J. Wysocki <rjw@rjwysocki.net>
      Cc: Matthew Garrett <mjg59@srcf.ucam.org>
  9. 07 Jan, 2014 1 commit
  10. 04 Jan, 2014 1 commit
  11. 03 Jan, 2014 1 commit
    • D
      x86: ksysfs.c build fix · 41a34cec
      Dave Young committed
      The kbuild test robot reports the error below for randconfig:
      
        arch/x86/kernel/ksysfs.c: In function 'get_setup_data_paddr':
        arch/x86/kernel/ksysfs.c:81:3: error: implicit declaration of function 'ioremap_cache' [-Werror=implicit-function-declaration]
        arch/x86/kernel/ksysfs.c:86:3: error: implicit declaration of function 'iounmap' [-Werror=implicit-function-declaration]
      
      Fix it by including <asm/io.h> in ksysfs.c
      Signed-off-by: Dave Young <dyoung@redhat.com>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>
  12. 29 Dec, 2013 3 commits
    • D
      x86: Reserve setup_data ranges late after parsing memmap cmdline · 77ea8c94
      Dave Young committed
      Currently e820_reserve_setup_data() is called before parsing early
      params, which works in the normal case. But for memmap=exactmap, the final
      memory ranges are created after parsing the memmap= cmdline params, so the
      earlier e820_reserve_setup_data() call has no effect. For example,
      setup_data ranges will still be marked as normal system RAM, so when
      the sysfs driver later ioremaps them the kernel will warn about mapping
      normal RAM.
      
      This patch fixes it by moving the e820_reserve_setup_data() call after
      parsing early params, so the ranges can be marked as reserved and the
      later ioremap will be fine with them.
      Signed-off-by: Dave Young <dyoung@redhat.com>
      Acked-by: Borislav Petkov <bp@suse.de>
      Tested-by: Toshi Kani <toshi.kani@hp.com>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>
    • D
      x86: Export x86 boot_params to sysfs · 5039e316
      Dave Young committed
      kexec-tools uses boot_params to get the 1st kernel's hardware_subarch, and
      the kexec kernel's EFI runtime support also needs to read the old efi_info
      from boot_params. Currently it exists in debugfs, which is not a good
      place for such information. Per HPA, we should avoid "sploit debugfs".
      
      In this patch /sys/kernel/boot_params is exported, with setup_data
      exported as a subdirectory. kexec-tools has been using debugfs for
      hardware_subarch for a long time, so we're not removing it yet.
      
      The structure looks like this:
      
      /sys/kernel/boot_params
      |__ data                /* boot_params in binary*/
      |__ setup_data
      |   |__ 0               /* the first setup_data node */
      |   |   |__ data        /* setup_data node 0 in binary*/
      |   |   |__ type        /* setup_data type of setup_data node 0, hex string */
      [snip]
      |__ version             /* boot protocol version (in hex, "0x" prefixed)*/
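
As a hedged sketch of how a tool might consume the hex-string attributes in this layout (version, or a setup_data node's type), the helper below parses one hex value from an already opened stream. The function name is hypothetical; only the paths come from the tree above.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: read one line from an attribute file and parse
 * it as a hexadecimal number. An optional "0x" prefix is accepted.
 * Returns 0 on success and -1 if nothing could be read or parsed.
 */
int read_hex_attr(FILE *f, unsigned long *out)
{
	char buf[64];
	char *end;

	if (!fgets(buf, sizeof(buf), f))
		return -1;
	*out = strtoul(buf, &end, 16);
	return (end == buf) ? -1 : 0;
}
```

A caller would fopen() /sys/kernel/boot_params/version and pass the stream in; the binary data files would need a plain fread() instead.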
      Signed-off-by: Dave Young <dyoung@redhat.com>
      Acked-by: Borislav Petkov <bp@suse.de>
      Tested-by: Toshi Kani <toshi.kani@hp.com>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>
    • D
      x86/efi: Pass necessary EFI data for kexec via setup_data · 1fec0533
      Dave Young committed
      Add a new setup_data type SETUP_EFI for kexec use.  Pass the saved
      fw_vendor, runtime, config tables and EFI runtime mappings.
      
      When entering virtual mode, directly map the EFI runtime regions
      which we passed in previously, and skip the call to
      SetVirtualAddressMap().
      
      Specifically for the HP z420 workstation we need to save the SMBIOS
      physical address.  The kernel boot sequence proceeds in the following order.
      Step 2 requires efi.smbios to be the physical address.  However, I found
      that on the HP z420 the EFI system table has a virtual address for SMBIOS
      in step 1.  Hence, we need to set it back to the physical address with the
      smbios in efi_setup_data.  (When it is still the physical address, this
      simply sets the same value.)
      
      1. efi_init() - Set efi.smbios from EFI system table
      2. dmi_scan_machine() - Temporarily map efi.smbios to access SMBIOS table
      3. efi_enter_virtual_mode() - Map EFI ranges
      
      Tested on ovmf+qemu, a Lenovo ThinkPad, a Dell laptop and an
      HP z420 workstation.
      Signed-off-by: Dave Young <dyoung@redhat.com>
      Tested-by: Toshi Kani <toshi.kani@hp.com>
      Signed-off-by: Matt Fleming <matt.fleming@intel.com>
  13. 21 Dec, 2013 1 commit
  14. 20 Dec, 2013 3 commits
  15. 17 Dec, 2013 1 commit
  16. 12 Dec, 2013 1 commit
  17. 05 Dec, 2013 2 commits
  18. 30 Nov, 2013 1 commit
  19. 27 Nov, 2013 2 commits
    • S
      perf/x86: Add RAPL hrtimer support · 65661f96
      Stephane Eranian committed
      The RAPL PMU counters do not interrupt on overflow.
      Therefore, the kernel needs to poll the counters
      to avoid missing an overflow. This patch adds
      the hrtimer code to do this.
      
      The timer interval is calculated at boot time
      based on the power unit used by the HW.
      
      There is one hrtimer per CPU to handle the case
      of multiple simultaneous uses across cores on
      the same package, as well as CPU hotplug.
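
The reason polling is enough can be sketched as follows: as long as the counter is read at least once per wrap period, the increment since the last read can be recovered with modular arithmetic. The helper below is an illustrative assumption, not the patch's code; the name and the explicit width parameter are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical helper: increment of a free-running counter that is
 * `width` bits wide, given the previous and current raw readings.
 * Correct as long as the counter wrapped at most once in between.
 */
static inline uint64_t counter_delta(uint64_t prev, uint64_t now, unsigned width)
{
	uint64_t mask = (width >= 64) ? ~0ULL : ((1ULL << width) - 1);

	return (now - prev) & mask;
}
```

With a 32-bit counter, a previous reading of 0xFFFFFFF0 and a current reading of 0x10 give a delta of 0x20, so a wrap between two polls is not lost. If the timer fired less often than the wrap period, a second wrap would go unnoticed, which is why the interval depends on how quickly the hardware counter can advance.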
      
      Thanks to Maria Dimakopoulou for her contributions
      to this patch especially on the math aspects.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Reviewed-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      [ Applied 32-bit build fix. ]
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: acme@redhat.com
      Cc: jolsa@redhat.com
      Cc: zheng.z.yan@intel.com
      Cc: bp@alien8.de
      Cc: maria.n.dimakopoulou@gmail.com
      Link: http://lkml.kernel.org/r/1384275531-10892-5-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • S
      perf/x86: Add Intel RAPL PMU support · 4788e5b4
      Stephane Eranian committed
      This patch adds a new uncore PMU to expose the Intel
      RAPL energy consumption counters. Up to 3 counters,
      each counting a particular RAPL event, are exposed.
      
      The RAPL counters are available on Intel SandyBridge,
      IvyBridge and Haswell. The server SKUs add a 3rd counter.
      
      The following events are available and exposed in sysfs:
      
        - power/energy-cores: power consumption of all cores on socket
        - power/energy-pkg: power consumption of all cores + LLC
        - power/energy-dram: power consumption of DRAM (servers only)
      
      For each event both the unit (Joules) and scale (2^-32 J)
      are exposed in sysfs for use by perf stat and other tools.
      The files are:
      
      	/sys/devices/power/events/energy-*.unit
      	/sys/devices/power/events/energy-*.scale
      
      The RAPL PMU is uncore by nature and is implemented such
      that it only works in system-wide mode. Measuring only
      one CPU per socket is sufficient. The /sys/devices/power/cpumask
      file can be used by tools to figure out which CPUs to monitor
      by default. For instance, on a 2-socket system, 2 CPUs
      (one on each socket) will be shown.
      
      All the counters measure in the same unit (exposed via sysfs).
      The perf_events API exposes all RAPL counters as 64-bit integers
      counting in units of 1/2^32 Joules (about 0.23 nJ). User level tools
      must convert the counts by multiplying them by 2^-32 to obtain
      Joules. The reason for this is that the kernel avoids
      doing floating point math whenever possible because it is
      expensive (user floating-point state must be saved). The method
      used avoids kernel floating-point usage. There is no loss of
      precision. Thanks to PeterZ for suggesting this approach.
      
      To convert the raw count C to Watts over an interval of time seconds:
         W = C * 2.3 / (1e10 * time)
      or, more precisely, ldexp(C, -32) / time.
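
The conversion can be written out as a short user-space sketch. The function names are hypothetical; the only facts taken from the text are the 2^-32 J scale and the division by elapsed time. Dividing by 2^32 as a double gives the same result as ldexp(C, -32).

```c
#include <stdint.h>

/* Hypothetical helper: raw RAPL count to Joules, using the 2^-32 J scale. */
static inline double rapl_joules(uint64_t count)
{
	/* 4294967296.0 is 2^32; equivalent to ldexp((double)count, -32). */
	return (double)count / 4294967296.0;
}

/* Hypothetical helper: average power in Watts over `seconds` of measurement. */
static inline double rapl_watts(uint64_t count, double seconds)
{
	return rapl_joules(count) / seconds;
}
```

A count of 2^33 collected over two seconds therefore corresponds to 2 J and an average of 1 W.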
      
      RAPL PMU is a new standalone PMU which registers with the
      perf_event core subsystem. The PMU type (attr->type) is
      dynamically allocated and is available from /sys/devices/power/type.
      
      Sampling is not supported by the RAPL PMU. There is no
      privilege level filtering either.
      Signed-off-by: Stephane Eranian <eranian@google.com>
      Reviewed-by: Maria Dimakopoulou <maria.n.dimakopoulou@gmail.com>
      Reviewed-by: Andi Kleen <ak@linux.intel.com>
      Signed-off-by: Peter Zijlstra <peterz@infradead.org>
      Cc: acme@redhat.com
      Cc: jolsa@redhat.com
      Cc: zheng.z.yan@intel.com
      Cc: bp@alien8.de
      Link: http://lkml.kernel.org/r/1384275531-10892-4-git-send-email-eranian@google.com
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
  20. 15 Nov, 2013 1 commit
  21. 14 Nov, 2013 1 commit