1. 26 9月, 2017 1 次提交
    • B
      powerpc/powernv: Rework EEH initialization on powernv · b9fde58d
      Benjamin Herrenschmidt 提交于
      Remove the post_init callback which is only used
      by powernv, we can just call it explicitly from
      the powernv code.
      
      This partially kills the ability to "disable" eeh at
      runtime via debugfs as this was calling that same
      callback again, but this is both unused and broken
      in several ways. If we want to revive it, we need
      to create a dedicated enable/disable callback on the
      backend that does the right thing.
      
      Let the bulk of eeh initialize normally at
      core_initcall() like it does on pseries by removing
      the hack in eeh_init() that delays it.
      
      Instead we make sure our eeh->probe cleanly bails
      out of the PEs haven't been created yet and we force
      a re-probe where we used to call eeh_init() again.
      Signed-off-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Acked-by: NRussell Currey <ruscur@russell.cc>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      b9fde58d
  2. 21 9月, 2017 2 次提交
    • T
      powerpc/pseries: Fix parent_dn reference leak in add_dt_node() · b537ca6f
      Tyrel Datwyler 提交于
      A reference to the parent device node is held by add_dt_node() for the
      node to be added. If the call to dlpar_configure_connector() fails
      add_dt_node() returns ENOENT and that reference is not freed.
      
      Add a call to of_node_put(parent_dn) prior to bailing out after a
      failed dlpar_configure_connector() call.
      
      Fixes: 8d5ff320 ("powerpc/pseries: Make dlpar_configure_connector parent node aware")
      Cc: stable@vger.kernel.org # v3.12+
      Signed-off-by: NTyrel Datwyler <tyreld@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      b537ca6f
    • T
      powerpc/pseries: Fix "OF: ERROR: Bad of_node_put() on /cpus" during DLPAR · 087ff6a5
      Tyrel Datwyler 提交于
      Commit 215ee763 ("powerpc: pseries: remove dlpar_attach_node
      dependency on full path") reworked dlpar_attach_node() to no longer
      look up the parent node "/cpus", but instead to have the parent node
      passed by the caller in the function parameter list.
      
      As a result dlpar_attach_node() is no longer responsible for freeing
      the reference to the parent node. However, commit 215ee763 failed
      to remove the of_node_put(parent) call in dlpar_attach_node(), or to
      take into account that the reference to the parent in the caller
      dlpar_cpu_add() needs to be held until after dlpar_attach_node()
      returns.
      
      As a result doing repeated cpu add/remove dlpar operations will
      eventually result in the following error:
      
        OF: ERROR: Bad of_node_put() on /cpus
        CPU: 0 PID: 10896 Comm: drmgr Not tainted 4.13.0-autotest #1
        Call Trace:
         dump_stack+0x15c/0x1f8 (unreliable)
         of_node_release+0x1a4/0x1c0
         kobject_put+0x1a8/0x310
         kobject_del+0xbc/0xf0
         __of_detach_node_sysfs+0x144/0x210
         of_detach_node+0xf0/0x180
         dlpar_detach_node+0xc4/0x120
         dlpar_cpu_remove+0x280/0x560
         dlpar_cpu_release+0xbc/0x1b0
         arch_cpu_release+0x6c/0xb0
         cpu_release_store+0xa0/0x100
         dev_attr_store+0x68/0xa0
         sysfs_kf_write+0xa8/0xf0
         kernfs_fop_write+0x2cc/0x400
         __vfs_write+0x5c/0x340
         vfs_write+0x1a8/0x3d0
         SyS_write+0xa8/0x1a0
         system_call+0x58/0x6c
      
      Fix the issue by removing the of_node_put(parent) call from
      dlpar_attach_node(), and ensuring that the reference to the parent
      node is properly held and released by the caller dlpar_cpu_add().
      
      Fixes: 215ee763 ("powerpc: pseries: remove dlpar_attach_node dependency on full path")
      Signed-off-by: NTyrel Datwyler <tyreld@linux.vnet.ibm.com>
      Reported-by: NAbdul Haleem <abdhalee@linux.vnet.ibm.com>
      [mpe: Add a comment in the code and frob the change log slightly]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      087ff6a5
  3. 20 9月, 2017 1 次提交
  4. 14 9月, 2017 1 次提交
    • M
      mm: treewide: remove GFP_TEMPORARY allocation flag · 0ee931c4
      Michal Hocko 提交于
      GFP_TEMPORARY was introduced by commit e12ba74d ("Group short-lived
      and reclaimable kernel allocations") along with __GFP_RECLAIMABLE.  It's
      primary motivation was to allow users to tell that an allocation is
      short lived and so the allocator can try to place such allocations close
      together and prevent long term fragmentation.  As much as this sounds
      like a reasonable semantic it becomes much less clear when to use the
      highlevel GFP_TEMPORARY allocation flag.  How long is temporary? Can the
      context holding that memory sleep? Can it take locks? It seems there is
      no good answer for those questions.
      
      The current implementation of GFP_TEMPORARY is basically GFP_KERNEL |
      __GFP_RECLAIMABLE which in itself is tricky because basically none of
      the existing caller provide a way to reclaim the allocated memory.  So
      this is rather misleading and hard to evaluate for any benefits.
      
      I have checked some random users and none of them has added the flag
      with a specific justification.  I suspect most of them just copied from
      other existing users and others just thought it might be a good idea to
      use without any measuring.  This suggests that GFP_TEMPORARY just
      motivates for cargo cult usage without any reasoning.
      
      I believe that our gfp flags are quite complex already and especially
      those with highlevel semantic should be clearly defined to prevent from
      confusion and abuse.  Therefore I propose dropping GFP_TEMPORARY and
      replace all existing users to simply use GFP_KERNEL.  Please note that
      SLAB users with shrinkers will still get __GFP_RECLAIMABLE heuristic and
      so they will be placed properly for memory fragmentation prevention.
      
      I can see reasons we might want some gfp flag to reflect shorterm
      allocations but I propose starting from a clear semantic definition and
      only then add users with proper justification.
      
      This was been brought up before LSF this year by Matthew [1] and it
      turned out that GFP_TEMPORARY really doesn't have a clear semantic.  It
      seems to be a heuristic without any measured advantage for most (if not
      all) its current users.  The follow up discussion has revealed that
      opinions on what might be temporary allocation differ a lot between
      developers.  So rather than trying to tweak existing users into a
      semantic which they haven't expected I propose to simply remove the flag
      and start from scratch if we really need a semantic for short term
      allocations.
      
      [1] http://lkml.kernel.org/r/20170118054945.GD18349@bombadil.infradead.org
      
      [akpm@linux-foundation.org: fix typo]
      [akpm@linux-foundation.org: coding-style fixes]
      [sfr@canb.auug.org.au: drm/i915: fix up]
        Link: http://lkml.kernel.org/r/20170816144703.378d4f4d@canb.auug.org.au
      Link: http://lkml.kernel.org/r/20170728091904.14627-1-mhocko@kernel.orgSigned-off-by: NMichal Hocko <mhocko@suse.com>
      Signed-off-by: NStephen Rothwell <sfr@canb.auug.org.au>
      Acked-by: NMel Gorman <mgorman@suse.de>
      Acked-by: NVlastimil Babka <vbabka@suse.cz>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Neil Brown <neilb@suse.de>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Signed-off-by: NAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: NLinus Torvalds <torvalds@linux-foundation.org>
      0ee931c4
  5. 02 9月, 2017 1 次提交
    • C
      powerpc/xive: guest exploitation of the XIVE interrupt controller · eac1e731
      Cédric Le Goater 提交于
      This is the framework for using XIVE in a PowerVM guest. The support
      is very similar to the native one in a much simpler form.
      
      Each source is associated with an Event State Buffer (ESB). This is a
      two bit state machine which is used to trigger events. The bits are
      named "P" (pending) and "Q" (queued) and can be controlled by MMIO.
      The Guest OS registers event (or notifications) queues on which the HW
      will post event data for a target to notify.
      
      Instead of OPAL calls, a set of Hypervisors call are used to configure
      the interrupt sources and the event/notification queues of the guest:
      
       - H_INT_GET_SOURCE_INFO
      
         used to obtain the address of the MMIO page of the Event State
         Buffer (PQ bits) entry associated with the source.
      
       - H_INT_SET_SOURCE_CONFIG
      
         assigns a source to a "target".
      
       - H_INT_GET_SOURCE_CONFIG
      
         determines to which "target" and "priority" is assigned to a source
      
       - H_INT_GET_QUEUE_INFO
      
         returns the address of the notification management page associated
         with the specified "target" and "priority".
      
       - H_INT_SET_QUEUE_CONFIG
      
         sets or resets the event queue for a given "target" and "priority".
         It is also used to set the notification config associated with the
         queue, only unconditional notification for the moment.  Reset is
         performed with a queue size of 0 and queueing is disabled in that
         case.
      
       - H_INT_GET_QUEUE_CONFIG
      
         returns the queue settings for a given "target" and "priority".
      
       - H_INT_RESET
      
         resets all of the partition's interrupt exploitation structures to
         their initial state, losing all configuration set via the hcalls
         H_INT_SET_SOURCE_CONFIG and H_INT_SET_QUEUE_CONFIG.
      
       - H_INT_SYNC
      
         issue a synchronisation on a source to make sure sure all
         notifications have reached their queue.
      
      As for XICS, the XIVE interface for the guest is described in the
      device tree under the "interrupt-controller" node. A couple of new
      properties are specific to XIVE :
      
       - "reg"
      
         contains the base address and size of the thread interrupt
         managnement areas (TIMA), also called rings, for the User level and
         for the Guest OS level. Only the Guest OS level is taken into
         account today.
      
       - "ibm,xive-eq-sizes"
      
         the size of the event queues. One cell per size supported, contains
         log2 of size, in ascending order.
      
       - "ibm,xive-lisn-ranges"
      
         the interrupt numbers ranges assigned to the guest. These are
         allocated using a simple bitmap.
      
      and also :
      
       - "/ibm,plat-res-int-priorities"
      
         contains a list of priorities that the hypervisor has reserved for
         its own use.
      
      Tested with a QEMU XIVE model for pseries and with the Power hypervisor.
      Signed-off-by: NCédric Le Goater <clg@kaod.org>
      Acked-by: NBenjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      eac1e731
  6. 01 9月, 2017 2 次提交
  7. 31 8月, 2017 24 次提交
  8. 28 8月, 2017 1 次提交
  9. 24 8月, 2017 1 次提交
    • R
      powerpc/powernv: Enable removal of memory for in memory tracing · 9d5171a8
      Rashmica Gupta 提交于
      The hardware trace macro feature requires access to a chunk of real
      memory. This patch provides a debugfs interface to do this. By
      writing an integer containing the size of memory to be unplugged into
      /sys/kernel/debug/powerpc/memtrace/enable, the code will attempt to
      remove that much memory from the end of each NUMA node.
      
      This patch also adds additional debugsfs files for each node that
      allows the tracer to interact with the removed memory, as well as
      a trace file that allows userspace to read the generated trace.
      
      Note that this patch does not invoke the hardware trace macro, it
      only allows memory to be removed during runtime for the trace macro
      to utilise.
      Signed-off-by: NRashmica Gupta <rashmica.g@gmail.com>
      [mpe: Minor formatting etc fixups]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      9d5171a8
  10. 23 8月, 2017 3 次提交
    • R
      powerpc: pseries: remove dlpar_attach_node dependency on full path · 215ee763
      Rob Herring 提交于
      In preparation to stop storing the full node path in full_name, remove the
      dependency on full_name from dlpar_attach_node(). Callers of
      dlpar_attach_node() already have the parent device_node, so just pass the
      parent node into dlpar_attach_node instead of the path. This avoids doing
      a lookup of the parent node by the path.
      Signed-off-by: NRob Herring <robh@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: linuxppc-dev@lists.ozlabs.org
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      215ee763
    • R
      powerpc: Convert to using %pOF instead of full_name · b7c670d6
      Rob Herring 提交于
      Now that we have a custom printf format specifier, convert users of
      full_name to use %pOF instead. This is preparation to remove storing
      of the full path string for each node.
      Signed-off-by: NRob Herring <robh@kernel.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Anatolij Gustschin <agust@denx.de>
      Cc: Scott Wood <oss@buserror.net>
      Cc: Kumar Gala <galak@kernel.crashing.org>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: linuxppc-dev@lists.ozlabs.org
      Reviewed-by: NTyrel Datwyler <tyreld@linux.vnet.ibm.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      b7c670d6
    • M
      powerpc/vio: Use device_type to detect family · bcf21e3a
      Michael Ellerman 提交于
      Currently in the vio.c code we use a comparision against the parent
      device node's full path to decide if the device is a PFO or VIO family
      device.
      
      Both the ibm,platform-facilities and vdevice nodes are defined by PAPR,
      and must have a matching device_type. So instead of using the path we
      can instead compare the device_type.
      
      I've checked Qemu and kvmtool both do this correctly, and all the
      PowerVM systems I have access to do also. So it seems to be safe.
      
      This removes the dependency on full_name, which is being removed
      upstream.
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      bcf21e3a
  11. 17 8月, 2017 1 次提交
  12. 15 8月, 2017 1 次提交
  13. 14 8月, 2017 1 次提交