1. 05 8月, 2019 1 次提交
  2. 26 2月, 2019 1 次提交
    • N
      powerpc/powernv: move OPAL call wrapper tracing and interrupt handling to C · 75d9fc7f
      Nicholas Piggin 提交于
      The OPAL call wrapper gets interrupt disabling wrong. It disables
      interrupts just by clearing MSR[EE], which has two problems:
      
      - It doesn't call into the IRQ tracing subsystem, which means tracing
        across OPAL calls does not always notice IRQs have been disabled.
      
      - It doesn't go through the IRQ soft-mask code, which causes a minor
        bug. MSR[EE] can not be restored by saving the MSR then clearing
        MSR[EE], because a racing interrupt while soft-masked could clear
        MSR[EE] between the two steps. This can cause MSR[EE] to be
        incorrectly enabled when the OPAL call returns. Fortunately that
        should only result in another masked interrupt being taken to
        disable MSR[EE] again, but it's a bit sloppy.
      
      The existing code also saves MSR to PACA, which is not re-entrant if
      there is a nested OPAL call from different MSR contexts, which can
      happen these days with SRESET interrupts on bare metal.
      
      To fix these issues, move the tracing and IRQ handling code to C, and
      call into asm just for the low level call when everything is ready to
      go. Save the MSR on stack rather than PACA.
      
      Performance cost is kept to a minimum with a few optimisations:
      
      - The endian switch upon return is combined with the MSR restore,
        which avoids an expensive context synchronizing operation for LE
        kernels. This makes up for the additional mtmsrd to enable
        interrupts with local_irq_enable().
      
      - blr is now used to return from the opal_* functions that are called
        as C functions, to avoid link stack corruption. This requires a
        skiboot fix as well to keep the call stack balanced.
      
      A NULL call is more costly after this, (410ns->430ns on POWER9), but
      OPAL calls are generally not performance critical at this scale.
      Signed-off-by: NNicholas Piggin <npiggin@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      75d9fc7f
  3. 16 7月, 2018 1 次提交
  4. 13 3月, 2018 1 次提交
  5. 24 1月, 2018 1 次提交
  6. 12 11月, 2017 1 次提交
    • S
      powerpc/vas: Export HVWC to debugfs · ece4e512
      Sukadev Bhattiprolu 提交于
      Export the VAS Window context information to debugfs.
      
      We need to hold a mutex when closing the window to prevent a race
      with the debugfs read(). Rather than introduce a per-instance mutex,
      we use the global vas_mutex for now, since it is not heavily contended.
      
      The window->cop field is only relevant to a receive window so we were
      not setting it for a send window (which is is paired to a receive window
      anyway). But to simplify reporting in debugfs, set the 'cop' field for the
      send window also.
      Signed-off-by: NSukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      ece4e512
  7. 02 11月, 2017 1 次提交
    • G
      License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman 提交于
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boiler plate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information it it.
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information,
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier to be applied to
      a file was done in a spreadsheet of side by side results from of the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few 1000 files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging was:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there was new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
      Reviewed-by: NKate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: NPhilippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: NThomas Gleixner <tglx@linutronix.de>
      Signed-off-by: NGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      b2441318
  8. 31 8月, 2017 1 次提交
  9. 24 8月, 2017 1 次提交
    • R
      powerpc/powernv: Enable removal of memory for in memory tracing · 9d5171a8
      Rashmica Gupta 提交于
      The hardware trace macro feature requires access to a chunk of real
      memory. This patch provides a debugfs interface to do this. By
      writing an integer containing the size of memory to be unplugged into
      /sys/kernel/debug/powerpc/memtrace/enable, the code will attempt to
      remove that much memory from the end of each NUMA node.
      
      This patch also adds additional debugsfs files for each node that
      allows the tracer to interact with the removed memory, as well as
      a trace file that allows userspace to read the generated trace.
      
      Note that this patch does not invoke the hardware trace macro, it
      only allows memory to be removed during runtime for the trace macro
      to utilise.
      Signed-off-by: NRashmica Gupta <rashmica.g@gmail.com>
      [mpe: Minor formatting etc fixups]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      9d5171a8
  10. 10 8月, 2017 3 次提交
  11. 25 7月, 2017 1 次提交
    • M
      powerpc/powernv: Detect and create IMC device · 8f95faaa
      Madhavan Srinivasan 提交于
      Code to create platform device for the In-Memory Collection (IMC)
      counters. Platform devices are created based on the IMC compatibility.
      New header file created to contain the data structures and macros
      needed for In-Memory Collection (IMC) counter pmu devices.
      
      The device tree for IMC counters starts at the node "imc-counters".
      This node contains all the IMC PMU nodes and event nodes for these IMC
      PMUs. Device probe() parses the device to locate three possible IMC
      device types (Nest/Core/Thread). Function then branch to parse each
      unit nodes to populate vital information such as device memory sizes,
      event nodes information, base address for reserve memory access (if
      any) and so on. Simple bare-minimum shutdown function added which only
      "stops" the engines.
      Signed-off-by: NAnju T Sudhakar <anju@linux.vnet.ibm.com>
      Signed-off-by: NHemant Kumar <hemant@linux.vnet.ibm.com>
      Signed-off-by: NMadhavan Srinivasan <maddy@linux.vnet.ibm.com>
      [mpe: Fix build with CONFIG_PERF_EVENTS=n]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      8f95faaa
  12. 14 7月, 2016 1 次提交
  13. 08 2月, 2016 1 次提交
  14. 27 12月, 2015 1 次提交
    • R
      powerpc/powernv: Add a kmsg_dumper that flushes console output on panic · affddff6
      Russell Currey 提交于
      On BMC machines, console output is controlled by the OPAL firmware and is
      only flushed when its pollers are called.  When the kernel is in a panic
      state, it no longer calls these pollers and thus console output does not
      completely flush, causing some output from the panic to be lost.
      
      Output is only actually lost when the kernel is configured to not power off
      or reboot after panic (i.e. CONFIG_PANIC_TIMEOUT is set to 0) since OPAL
      flushes the console buffer as part of its power down routines.  Before this
      patch, however, only partial output would be printed during the timeout wait.
      
      This patch adds a new kmsg_dumper which gets called at panic time to ensure
      panic output is not lost.  It accomplishes this by calling OPAL_CONSOLE_FLUSH
      in the OPAL API, and if that is not available, the pollers are called enough
      times to (hopefully) completely flush the buffer.
      
      The flushing mechanism will only affect output printed at and before the
      kmsg_dump call in kernel/panic.c:panic().  As such, the "end Kernel panic"
      message may still be truncated as follows:
      
      >Call Trace:
      >[c000000f1f603b00] [c0000000008e9458] dump_stack+0x90/0xbc (unreliable)
      >[c000000f1f603b30] [c0000000008e7e78] panic+0xf8/0x2c4
      >[c000000f1f603bc0] [c000000000be4860] mount_block_root+0x288/0x33c
      >[c000000f1f603c80] [c000000000be4d14] prepare_namespace+0x1f4/0x254
      >[c000000f1f603d00] [c000000000be43e8] kernel_init_freeable+0x318/0x350
      >[c000000f1f603dc0] [c00000000000bd74] kernel_init+0x24/0x130
      >[c000000f1f603e30] [c0000000000095b0] ret_from_kernel_thread+0x5c/0xac
      >---[ end Kernel panic - not
      
      This functionality is implemented as a kmsg_dumper as it seems to be the
      most sensible way to introduce platform-specific functionality to the
      panic function.
      Signed-off-by: NRussell Currey <ruscur@russell.cc>
      Reviewed-by: NAndrew Donnellan <andrew.donnellan@au1.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      affddff6
  15. 17 12月, 2015 1 次提交
    • A
      powerpc/powernv: Add support for Nvlink NPUs · 5d2aa710
      Alistair Popple 提交于
      NVLink is a high speed interconnect that is used in conjunction with a
      PCI-E connection to create an interface between CPU and GPU that
      provides very high data bandwidth. A PCI-E connection to a GPU is used
      as the control path to initiate and report status of large data
      transfers sent via the NVLink.
      
      On IBM Power systems the NVLink processing unit (NPU) is similar to
      the existing PHB3. This patch adds support for a new NPU PHB type. DMA
      operations on the NPU are not supported as this patch sets the TCE
      translation tables to be the same as the related GPU PCIe device for
      each NVLink. Therefore all DMA operations are setup and controlled via
      the PCIe device.
      
      EEH is not presently supported for the NPU devices, although it may be
      added in future.
      Signed-off-by: NAlistair Popple <alistair@popple.id.au>
      Signed-off-by: NGavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      5d2aa710
  16. 05 6月, 2015 1 次提交
  17. 22 5月, 2015 2 次提交
  18. 17 3月, 2015 1 次提交
  19. 04 2月, 2015 1 次提交
  20. 05 8月, 2014 1 次提交
  21. 11 7月, 2014 1 次提交
    • A
      powernv: Add OPAL tracepoints · c49f6353
      Anton Blanchard 提交于
      Knowing how long we spend in firmware calls is an important part of
      minimising OS jitter.
      
      This patch adds tracepoints to each OPAL call. If tracepoints are
      enabled we branch out to a common routine that calls an entry and exit
      tracepoint.
      
      This allows us to write tools that monitor the frequency and duration
      of OPAL calls, eg:
      
      name                  count  total(ms)  min(ms)  max(ms)  avg(ms)  period(ms)
      OPAL_HANDLE_INTERRUPT     5      0.199    0.037    0.042    0.040   12547.545
      OPAL_POLL_EVENTS        204      2.590    0.012    0.036    0.013    2264.899
      OPAL_PCI_MSI_EOI       2830      3.066    0.001    0.005    0.001      81.166
      
      We use jump labels if configured, which means we only add a single
      nop instruction to every OPAL call when the tracepoints are disabled.
      Signed-off-by: NAnton Blanchard <anton@samba.org>
      Acked-by: NPaul E. McKenney <paulmck@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      c49f6353
  22. 25 6月, 2014 1 次提交
    • M
      powerpc/powernv: Remove OPAL v1 takeover · e2500be2
      Michael Ellerman 提交于
      In commit 27f44888 "Add OPAL takeover from PowerVM" we added support
      for "takeover" on OPAL v1 machines.
      
      This was a mode of operation where we would boot under pHyp, and query
      for the presence of OPAL. If detected we would then do a special
      sequence to take over the machine, and the kernel would end up running
      in hypervisor mode.
      
      OPAL v1 was never a supported product, and was never shipped outside
      IBM. As far as we know no one is still using it.
      
      Newer versions of OPAL do not use the takeover mechanism. Although the
      query for OPAL should be harmless on machines with newer OPAL, we have
      seen a machine where it causes a crash in Open Firmware.
      
      The code in early_init_devtree() to copy boot_command_line into cmd_line
      was added in commit 817c21ad "Get kernel command line accross OPAL
      takeover", and AFAIK is only used by takeover, so should also be
      removed.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      e2500be2
  23. 11 6月, 2014 1 次提交
  24. 28 5月, 2014 1 次提交
    • M
      powerpc/powernv: Add support for POWER8 split core on powernv · e2186023
      Michael Ellerman 提交于
      Upcoming POWER8 chips support a concept called split core. This is where the
      core can be split into subcores that although not full cores, are able to
      appear as full cores to a guest.
      
      The splitting & unsplitting procedure is mildly complicated, and explained at
      length in the comments within the patch.
      
      One notable detail is that when splitting or unsplitting we need to pull
      offline cpus out of their offline state to do work as part of the procedure.
      
      The interface for changing the split mode is via a sysfs file, eg:
      
       $ echo 2 > /sys/devices/system/cpu/subcores_per_core
      
      Currently supported values are '1', '2' and '4'. And indicate respectively that
      the core should be unsplit, split in half, and split in quarters. These modes
      correspond to threads_per_subcore of 8, 4 and 2.
      
      We do not allow changing the split mode while KVM VMs are active. This is to
      prevent the value changing while userspace is configuring the VM, and also to
      prevent the mode being changed in such a way that existing guests are unable to
      be run.
      
      CPU hotplug fixes by Srivatsa.  max_cpus fixes by Mahesh.  cpuset fixes by
      benh.  Fix for irq race by paulus.  The rest by mikey and mpe.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: NMichael Neuling <mikey@neuling.org>
      Signed-off-by: NSrivatsa S. Bhat <srivatsa.bhat@linux.vnet.ibm.com>
      Signed-off-by: NMahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      e2186023
  25. 09 4月, 2014 1 次提交
  26. 24 3月, 2014 3 次提交
  27. 07 3月, 2014 2 次提交
    • S
      powerpc/powernv Platform dump interface · c7e64b9c
      Stewart Smith 提交于
      This enables support for userspace to fetch and initiate FSP and
      Platform dumps from the service processor (via firmware) through sysfs.
      
      Based on original patch from Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
      
      Flow:
        - We register for OPAL notification events.
        - OPAL sends new dump available notification.
        - We make information on dump available via sysfs
        - Userspace requests dump contents
        - We retrieve the dump via OPAL interface
        - User copies the dump data
        - userspace sends ack for dump
        - We send ACK to OPAL.
      
      sysfs files:
        - We add the /sys/firmware/opal/dump directory
        - echoing 1 (well, anything, but in future we may support
          different dump types) to /sys/firmware/opal/dump/initiate_dump
          will initiate a dump.
        - Each dump that we've been notified of gets a directory
          in /sys/firmware/opal/dump/ with a name of the dump type and ID (in hex,
          as this is what's used elsewhere to identify the dump).
        - Each dump has files: id, type, dump and acknowledge
          dump is binary and is the dump itself.
          echoing 'ack' to acknowledge (currently any string will do) will
          acknowledge the dump and it will soon after disappear from sysfs.
      
      OPAL APIs:
        - opal_dump_init()
        - opal_dump_info()
        - opal_dump_read()
        - opal_dump_ack()
        - opal_dump_resend_notification()
      
      Currently we are only ever notified for one dump at a time (until
      the user explicitly acks the current dump, then we get a notification
      of the next dump), but this kernel code should "just work" when OPAL
      starts notifying us of all the dumps present.
      Signed-off-by: NStewart Smith <stewart@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      c7e64b9c
    • S
      powerpc/powernv: Read OPAL error log and export it through sysfs · 774fea1a
      Stewart Smith 提交于
      Based on a patch by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
      
      This patch adds support to read error logs from OPAL and export
      them to userspace through a sysfs interface.
      
      We export each log entry as a directory in /sys/firmware/opal/elog/
      
      Currently, OPAL will buffer up to 128 error log records, we don't
      need to have any knowledge of this limit on the Linux side as that
      is actually largely transparent to us.
      
      Each error log entry has the following files: id, type, acknowledge, raw.
      Currently we just export the raw binary error log in the 'raw' attribute.
      In a future patch, we may parse more of the error log to make it a bit
      easier for userspace (e.g. to be able to display a brief summary in
      petitboot without having to have a full parser).
      
      If we have >128 logs from OPAL, we'll only be notified of 128 until
      userspace starts acknowledging them. This limitation may be lifted in
      the future and with this patch, that should "just work" from the linux side.
      
      A userspace daemon should:
      - wait for error log entries using normal mechanisms (we announce creation)
      - read error log entry
      - save error log entry safely to disk
      - acknowledge the error log entry
      - rinse, repeat.
      
      On the Linux side, we read the error log when we're notified of it. This
      possibly isn't ideal as it would be better to only read them on-demand.
      However, this doesn't really work with current OPAL interface, so we
      read the error log immediately when notified at the moment.
      
      I've tested this pretty extensively and am rather confident that the
      linux side of things works rather well. There is currently an issue with
      the service processor side of things for >128 error logs though.
      Signed-off-by: NStewart Smith <stewart@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      774fea1a
  28. 09 12月, 2013 1 次提交
  29. 30 10月, 2013 1 次提交
    • V
      powerpc/powernv: Code update interface · 50bd6153
      Vasant Hegde 提交于
      Code update interface for powernv platform. This provides
      sysfs interface to pass new image, validate, update and
      commit images.
      
      This patch includes:
        - Below OPAL APIs for code update
          - opal_validate_flash()
          - opal_manage_flash()
          - opal_update_flash()
      
        - Create below sysfs files under /sys/firmware/opal
          - image		: Interface to pass new FW image
          - validate_flash	: Validate candidate image
          - manage_flash	: Commit/Reject operations
          - update_flash	: Flash new candidate image
      
      Updating Image:
        "update_flash" is an interface to indicate flash new FW.
      It just passes image SG list to FW. Actual flashing is done
      during system reboot time.
      
      Note:
        - SG entry format:
          I have kept version number to keep this list similar to what
          PAPR is defined.
      Signed-off-by: NVasant Hegde <hegdevasant@linux.vnet.ibm.com>
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      50bd6153
  30. 11 10月, 2013 2 次提交
  31. 14 8月, 2013 1 次提交
  32. 20 6月, 2013 2 次提交