1. 26 Apr, 2017 1 commit
    • powerpc/64s: Revert setting of LPCR[LPES] on POWER9 · 8bf8f2e8
      Nicholas Piggin authored
      The XIVE enablement patches included a change to set the LPES (Logical
      Partitioning Environment Selector) bit (bit # 3) in LPCR (Logical Partitioning
      Control Register) on POWER9 hosts. This bit sets external interrupts to guest
      delivery mode, which uses SRR0/1. The host's EE interrupt handler is written to
      expect HSRR0/1 (for earlier CPUs). This should be fine because XIVE is
      configured not to deliver EEs to the host (Hypervisor Virtualization Interrupt is
      used instead) so the EE handler should never be executed.
      
      However a bug in interrupt controller code, hardware, or odd configuration of a
      simulator could result in the host getting an EE incorrectly. Keeping the EE
      delivery mode matching the host EE handler prevents strange crashes due to using
      the wrong exception registers.
      
      KVM will configure the LPCR to set LPES prior to running a guest so that EEs are
      delivered to the guest using SRR0/1.
      
      Fixes: 08a1e650 ("powerpc: Fixup LPCR:PECE and HEIC setting on POWER9")
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      [mpe: Massage change log to avoid referring to LPES0 which is now renamed LPES]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  2. 10 Apr, 2017 6 commits
  3. 07 Apr, 2017 1 commit
    • powerpc/smp: Remove migrate_irq() custom implementation · a978e139
      Benjamin Herrenschmidt authored
      Some powerpc platforms use this to move IRQs away from a CPU being
      unplugged. This function has several bugs such as not taking the right
      locks or failing to NULL check pointers.
      
      There's a new generic function doing exactly the same thing without all
      the bugs, so let's use it instead.
      
      mpe: The obvious place for the select of GENERIC_IRQ_MIGRATION is on
      HOTPLUG_CPU, but that doesn't work. On some configs PM_SLEEP_SMP will
      select HOTPLUG_CPU even though its dependencies are not met, which means
      the select of GENERIC_IRQ_MIGRATION doesn't happen. That leads to the
      build breaking. Fix it by moving the select of GENERIC_IRQ_MIGRATION to
      SMP.
      Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
  4. 06 Apr, 2017 3 commits
  5. 17 Mar, 2017 1 commit
  6. 16 Mar, 2017 1 commit
  7. 11 Mar, 2017 1 commit
    • kexec, x86/purgatory: Unbreak it and clean it up · 40c50c1f
      Thomas Gleixner authored
      The purgatory code defines global variables which are referenced via a
      symbol lookup in the kexec code (core and arch).
      
      A recent commit addressing sparse warnings made these static and thereby
      broke kexec_file.
      
      Why did this happen? Simply because the whole machinery is undocumented and
      lacks any form of forward declarations. The variable names are unspecific
      and lack a prefix, so adding forward declarations creates shadow variables
      in the core code. Aside from that, the code relies on magic constants and
      duplicate struct definitions with no way to ensure that these things stay
      in sync. The section placement of the purgatory variables happened by
      chance and not by design.
      
      Unbreak kexec and clean up the mess:
      
       - Add proper forward declarations and document the usage
       - Use common struct definition
       - Use the proper common defines instead of magic constants
       - Add a purgatory_ prefix to have a proper name space
       - Use ARRAY_SIZE() instead of a homebrewed reimplementation
       - Add proper sections to the purgatory variables [ From Mike ]
      
      Fixes: 72042a8c ("x86/purgatory: Make functions and variables static")
      Reported-by: Mike Galbraith <efault@gmx.de>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Cc: Nicholas Mc Guire <der.herr@hofr.at>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: "Tobin C. Harding" <me@tobin.cc>
      Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1703101315140.3681@nanos
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  8. 10 Mar, 2017 6 commits
  9. 09 Mar, 2017 5 commits
    • powerpc/powernv/ioda2: Update iommu table base on ownership change · db08e1d5
      Alexey Kardashevskiy authored
      On the POWERNV platform, in order to do DMA via IOMMU (i.e. 32bit DMA in
      our case), a device needs an iommu_table pointer set via
      set_iommu_table_base().
      
      The codeflow is:
      - pnv_pci_ioda2_setup_dma_pe()
      	- pnv_pci_ioda2_setup_default_config()
      	- pnv_ioda_setup_bus_dma() [1]
      
      pnv_pci_ioda2_setup_dma_pe() creates IOMMU groups,
      pnv_pci_ioda2_setup_default_config() does default DMA setup,
      pnv_ioda_setup_bus_dma() takes a bus PE (on IODA2, all physical function
      PEs are bus PEs except NPU), walks through all underlying buses and
      devices, adds all devices to an IOMMU group and sets iommu_table.
      
      On IODA2, when VFIO is used, it takes ownership over a PE which means it
      removes all tables and creates new ones (with a possibility of sharing
      them among PEs). So when the ownership is returned from VFIO to
      the kernel, the iommu_table pointer written to a device at [1] is
      stale and needs an update.
      
      This adds an "add_to_group" parameter to pnv_ioda_setup_bus_dma()
      (in fact re-adds as it used to be there a while ago for different
      reasons) to tell the helper if a device needs to be added to
      an IOMMU group with an iommu_table update or just the latter.
      
      This calls pnv_ioda_setup_bus_dma(..., false) from
      pnv_ioda2_release_ownership() so when the ownership is restored,
      32bit DMA can work again for a device. This does the same thing
      on obtaining ownership as the iommu_table pointer is stale at this point
      anyway and it is safer to have NULL there.
      
      We did not hit this earlier as all tested devices in recent years were
      only using 64bit DMA; the rare exception to this is the MPT3 SAS adapter
      which uses both 32bit and 64bit DMA access and it has not been tested
      with VFIO much.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/powernv/ioda2: Gracefully fail if too many TCE levels requested · 7aafac11
      Alexey Kardashevskiy authored
      The IODA2 specification says that a 64-bit DMA address cannot use the top 4 bits
      (3 are reserved and one is a "TVE select"); bottom page_shift bits
      cannot be used for multilevel table addressing either.
      
      The existing IODA2 table allocation code aligns the minimum TCE table
      size to PAGE_SIZE so in the case of 64K system pages and 4K IOMMU pages,
      we have 64-4-12=48 bits. Since a 64K page stores 8192 TCEs, i.e. needs
      13 bits, the maximum number of levels is 48/13 = 3 so we physically
      cannot address more and EEH happens on DMA accesses.
      
      This adds a check for too many levels being requested.
      
      It is still possible to have 5 levels in the case of 4K system page size.
      Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
      Acked-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
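The level limit described in the change log above is plain arithmetic, sketched here in C. This is a minimal illustration and not the kernel's actual check: the function name `max_tce_levels()` and its parameters are hypothetical, and it assumes 8-byte TCEs, which is what 8192 TCEs per 64K page implies.

```c
#include <assert.h>

/*
 * Hypothetical helper (not kernel code): how many TCE table levels can be
 * addressed by a 64-bit DMA address under the constraints in the change log.
 */
unsigned int max_tce_levels(unsigned int system_page_shift,
                            unsigned int iommu_page_shift)
{
    /* Top 4 bits are unusable: 3 reserved plus one "TVE select". */
    const unsigned int reserved_bits = 4;
    unsigned int usable_bits, bits_per_level;

    assert(system_page_shift > 3);
    assert(iommu_page_shift <= 64 - reserved_bits);

    /* The bottom iommu_page_shift bits are the offset inside an IOMMU page. */
    usable_bits = 64 - reserved_bits - iommu_page_shift;

    /* Assumption: one level is one system page of 8-byte TCEs,
     * i.e. 2^(system_page_shift - 3) entries per level. */
    bits_per_level = system_page_shift - 3;

    return usable_bits / bits_per_level;
}
```

With 64K system pages and 4K IOMMU pages this gives 48 / 13 = 3 levels; with 4K system pages it gives 48 / 9 = 5, matching the last sentence of the change log.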
    • powerpc/perf: Handle sdar_mode for marked event in power9 · 78b4416a
      Madhavan Srinivasan authored
      MMCRA[SDAR_MODE] specifies how the SDAR should be updated in
      continuous sampling mode. On P9 it must be set to 0b00 when
      MMCRA[63] is set.
      
      Fixes: c7c3f568 ('powerpc/perf: macros for power9 format encoding')
      Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/perf: Fix perf_get_data_addr() for power9 DD1 · f04d1080
      Madhavan Srinivasan authored
      Power9 DD1 does not support the PMU_HAS_SIER flag, and sdsync in
      perf_get_data_addr() defaults to MMCRA_SDSYNC, which is wrong. Since
      the power9 MMCRA does not support the SDSYNC bit, this patch adds the
      PPMU_NO_SIAR flag to the check and sets sdsync to MMCRA_SAMPLE_ENABLE.
      
      Fixes: 27593d72 ("powerpc/perf: Use MSR to report privilege level on P9 DD1")
      Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • axonram: Fix gendisk handling · 672a2c87
      Jan Kara authored
      It is invalid to call del_gendisk() when disk->queue is NULL. Fix error
      handling in axon_ram_probe() to avoid doing that.
      
      Also del_gendisk() does not drop a reference to gendisk allocated by
      alloc_disk(). That has to be done by put_disk(). Add that call where
      needed.
      Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
      Signed-off-by: Jan Kara <jack@suse.cz>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  10. 08 Mar, 2017 2 commits
  11. 06 Mar, 2017 6 commits
    • powerpc: Sort the selects under CONFIG_PPC · a7d2475a
      Michael Ellerman authored
      We have a big list of selects under CONFIG_PPC, and currently they're
      completely unsorted. This means people tend to add new selects at the
      bottom of the list, and so two commits which both add a new select will
      often conflict.
      
      Instead sort it alphabetically. This is nicer in and of itself, but also
      means two commits that add a new select will have a greater chance of
      not conflicting.
      
      Add a note at the top and bottom asking people to keep it sorted.
      
      And while we're here pad out the 'if' expressions to make them stand
      out.
      Suggested-by: Stephen Rothwell <sfr@canb.auug.org.au>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/64: Fix L1D cache shape vector reporting L1I values · 9c7a0086
      Michael Ellerman authored
      It seems we didn't pay quite enough attention when testing the new cache
      shape vectors, which means we didn't notice the bug where the vector for
      the L1D was using the L1I values. Fix it, resulting in e.g.:
      
        L1I  cache size:     0x8000      32768B         32K
        L1I  line size:        0x80       8-way associative
        L1D  cache size:    0x10000      65536B         64K
        L1D  line size:        0x80       8-way associative
      
      Fixes: 98a5f361 ("powerpc: Add new cache geometry aux vectors")
      Cut-and-paste-bug-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Badly-reviewed-by: Michael Ellerman <mpe@ellerman.id.au>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/64: Avoid panic during boot due to divide by zero in init_cache_info() · 6ba422c7
      Anton Blanchard authored
      I see a panic in early boot when building with a recent gcc toolchain.
      The issue is a divide by zero, which is undefined. Older toolchains
      let us get away with it:
      
      int foo(int a) { return a / 0; }
      
      foo:
      	li 9,0
      	divw 3,3,9
      	extsw 3,3
      	blr
      
      But newer ones catch it:
      
      foo:
      	trap
      
      Add a check to avoid the divide by zero.
      
      Fixes: e2827fe5 ("powerpc/64: Clean up ppc64_caches using a struct per cache")
      Signed-off-by: Anton Blanchard <anton@samba.org>
      Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
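The kind of guard the change log above describes can be sketched as follows. This is only an illustration under assumptions: the change log does not say which divisor was zero, so the function `blocks_per_page()`, its parameters, and the fallback value of 0 are hypothetical and not taken from init_cache_info().

```c
#include <assert.h>

/*
 * Hypothetical stand-in for a cache-geometry computation. Integer division
 * by zero is undefined behaviour in C, so a compiler is free to emit a trap
 * for it; checking the divisor first keeps the behaviour defined.
 */
unsigned int blocks_per_page(unsigned int page_size, unsigned int block_size)
{
    if (block_size == 0)
        return 0;   /* assumed fallback: report "unknown" as zero */

    return page_size / block_size;
}

/* Usage example: a zero block size no longer reaches the division. */
void blocks_per_page_example(void)
{
    assert(blocks_per_page(65536, 128) == 512);
    assert(blocks_per_page(65536, 0) == 0);
}
```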
    • powerpc: Update to new option-vector-5 format for CAS · 014d02cb
      Suraj Jitindar Singh authored
      On POWER9 the ibm,client-architecture-support (CAS) negotiation process
      has been updated to change how the host to guest negotiation is done for
      the new hash/radix mmu as well as the nest mmu, process tables and guest
      translation shootdown (GTSE).
      
      This is documented in the unreleased PAPR ACR "CAS option vector
      additions for P9".
      
      The host tells the guest which options it supports in
      ibm,arch-vec-5-platform-support. The guest then chooses a subset of these
      to request in the CAS call and these are agreed to in the
      ibm,architecture-vec-5 property of the chosen node.
      
      Thus we read ibm,arch-vec-5-platform-support and make our selection before
      calling CAS. We then parse the ibm,architecture-vec-5 property of the
      chosen node to check whether we should run as hash or radix.
      
      ibm,arch-vec-5-platform-support format:
      
      index value pairs: <index, val> ... <index, val>
      
      index: Option vector 5 byte number
      val:   Some representation of supported values
      Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
      Acked-by: Paul Mackerras <paulus@ozlabs.org>
      [mpe: Don't print about unknown options, be consistent with OV5_FEAT]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
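The <index, val> layout described in the change log above can be walked with a simple loop. The sketch below is an assumption-laden illustration, not the kernel's parser: it assumes each index and each value occupies one byte, and the function name `ov5_platform_support_lookup()` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical lookup over a buffer laid out as
 *   <index, val> ... <index, val>
 * where index is an option vector 5 byte number. Assumes one byte per
 * element. Returns 1 and stores the value if the index is present,
 * 0 otherwise. A trailing unpaired byte is ignored.
 */
int ov5_platform_support_lookup(const uint8_t *prop, size_t len,
                                uint8_t index, uint8_t *val)
{
    size_t i;

    assert(prop != NULL || len == 0);
    assert(val != NULL);

    for (i = 0; i + 1 < len; i += 2) {
        if (prop[i] == index) {
            *val = prop[i + 1];
            return 1;
        }
    }
    return 0;
}
```

A caller would read the property into a byte buffer first, then query the indexes it cares about before making its selection for the CAS call.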
    • powerpc: Parse the command line before calling CAS · 12cc9fd6
      Suraj Jitindar Singh authored
      On POWER9 the hypervisor requires the guest to decide whether it would
      like to use a hash or radix mmu model at the time it calls
      ibm,client-architecture-support (CAS) based on what the hypervisor has
      said it's allowed to do. It is possible to disable radix by passing
      "disable_radix" on the command line. The next patch will add support for
      the new CAS format, thus we need to parse the command line before calling
      CAS so we can correctly select which mmu we would like to use.
      Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
      Reviewed-by: Paul Mackerras <paulus@ozlabs.org>
      Acked-by: Balbir Singh <bsingharora@gmail.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/xics: Work around limitations of OPAL XICS priority handling · a69e2fb7
      Balbir Singh authored
      The CPPR (Current Processor Priority Register) of a XICS interrupt
      presentation controller contains a value N, such that only interrupts
      with a priority "more favoured" than N will be received by the CPU,
      where "more favoured" means "less than". So if the CPPR has the value 5
      then only interrupts with a priority of 0-4 inclusive will be received.
      
      In theory the CPPR can support a value of 0 to 255 inclusive.
      In practice Linux only uses values of 0, 4, 5 and 0xff. Setting the CPPR
      to 0 rejects all interrupts, setting it to 0xff allows all interrupts.
      The values 4 and 5 are used to differentiate IPIs from external
      interrupts. Setting the CPPR to 5 allows IPIs to be received but not
      external interrupts.
      
      The CPPR emulation in the OPAL XICS implementation only directly
      supports priorities 0 and 0xff. All other priorities are considered
      equivalent, and mapped to a single priority value internally. This means
      that when using icp-opal we cannot allow IPIs while blocking externals.
      
      This breaks Linux's use of priority values when a CPU is hot unplugged.
      After migrating IRQs away from the CPU that is being offlined, we set
      the priority to 5, meaning we still want the offline CPU to receive
      IPIs. But the effect of the OPAL XICS emulation's use of a single
      priority value is that all interrupts are rejected by the CPU. With the
      CPU offline, and not receiving IPIs, we may not be able to wake it up to
      bring it back online.
      
      The first part of the fix is in icp_opal_set_cpu_priority(). CPPR values
      of 0 to 4 inclusive will correctly cause all interrupts to be rejected,
      so we pass those CPPR values through to OPAL. However if we are called
      with a CPPR of 5 or greater, the caller is expecting to be able to allow
      IPIs but not external interrupts. We know this doesn't work, so instead
      of rejecting all interrupts we choose the opposite which is to allow all
      interrupts. This is still not correct behaviour, but we know for the
      only existing caller (xics_migrate_irqs_away()), that it is the better
      option.
      
      The other part of the fix is in xics_migrate_irqs_away(). Instead of
      setting priority (CPPR) to 0, and then back to 5 before migrating IRQs,
      we migrate the IRQs before setting the priority back to 5. This should
      have no effect on an ICP backend with a working set_priority(), and on
      icp-opal it means we will keep all interrupts blocked until after we've
      finished doing the IRQ migration. Additionally we wait for 5ms after
      doing the migration to make sure there are no IRQs in flight.
      
      Fixes: d7436188 ("powerpc/xics: Add ICP OPAL backend")
      Cc: stable@vger.kernel.org # v4.8+
      Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
      Reported-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
      Tested-by: Vaidyanathan Srinivasan <svaidy@linux.vnet.ibm.com>
      Signed-off-by: Balbir Singh <bsingharora@gmail.com>
      [mpe: Rewrote comments and change log, change delay to 5ms]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
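The CPPR rule and the icp-opal workaround described in the change log above can be modelled in a few lines of C. This is a sketch under assumptions, not the driver code: the macro and function names are hypothetical, and it assumes IPIs are sent at priority 4 and external interrupts at priority 5, which is how the change log reads.

```c
#include <assert.h>
#include <stdbool.h>

/* CPPR values named in the change log; the macro names are hypothetical. */
#define CPPR_REJECT_ALL 0x00
#define CPPR_IPI_ONLY   0x05
#define CPPR_ALLOW_ALL  0xff

/* An interrupt is received only if it is "more favoured" (numerically
 * lower) than the current CPPR. */
bool xics_irq_accepted(unsigned char cppr, unsigned char irq_priority)
{
    return irq_priority < cppr;
}

/* Model of the workaround: OPAL treats everything other than 0 and 0xff as
 * one priority, so a request for 5 or more is turned into "allow all"
 * rather than silently rejecting everything. Values 0-4 pass through. */
unsigned char icp_opal_effective_cppr(unsigned char requested)
{
    if (requested >= CPPR_IPI_ONLY)
        return CPPR_ALLOW_ALL;
    return requested;
}

/* Usage example for the rules above. */
void cppr_example(void)
{
    assert(xics_irq_accepted(CPPR_IPI_ONLY, 4));    /* IPI gets through */
    assert(!xics_irq_accepted(CPPR_IPI_ONLY, 5));   /* external is held */
    assert(!xics_irq_accepted(CPPR_REJECT_ALL, 0)); /* nothing gets through */
    assert(icp_opal_effective_cppr(5) == CPPR_ALLOW_ALL);
    assert(icp_opal_effective_cppr(4) == 4);
}
```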
  12. 04 Mar, 2017 2 commits
  13. 03 Mar, 2017 5 commits
    • sched/headers: Move task->mm handling methods to <linux/sched/mm.h> · 68e21be2
      Ingo Molnar authored
      Move the following task->mm helper APIs into a new header file,
      <linux/sched/mm.h>, to further reduce the size and complexity
      of <linux/sched.h>.
      
      Here is how the APIs are used in various kernel files:
      
        # mm_alloc():
        arch/arm/mach-rpc/ecard.c
        fs/exec.c
        include/linux/sched/mm.h
        kernel/fork.c
      
        # __mmdrop():
        arch/arc/include/asm/mmu_context.h
        include/linux/sched/mm.h
        kernel/fork.c
      
        # mmdrop():
        arch/arm/mach-rpc/ecard.c
        arch/m68k/sun3/mmu_emu.c
        arch/x86/mm/tlb.c
        drivers/gpu/drm/amd/amdkfd/kfd_process.c
        drivers/gpu/drm/i915/i915_gem_userptr.c
        drivers/infiniband/hw/hfi1/file_ops.c
        drivers/vfio/vfio_iommu_spapr_tce.c
        fs/exec.c
        fs/proc/base.c
        fs/proc/task_mmu.c
        fs/proc/task_nommu.c
        fs/userfaultfd.c
        include/linux/mmu_notifier.h
        include/linux/sched/mm.h
        kernel/fork.c
        kernel/futex.c
        kernel/sched/core.c
        mm/khugepaged.c
        mm/ksm.c
        mm/mmu_context.c
        mm/mmu_notifier.c
        mm/oom_kill.c
        virt/kvm/kvm_main.c
      
        # mmdrop_async_fn():
        include/linux/sched/mm.h
      
        # mmdrop_async():
        include/linux/sched/mm.h
        kernel/fork.c
      
        # mmget_not_zero():
        fs/userfaultfd.c
        include/linux/sched/mm.h
        mm/oom_kill.c
      
        # mmput():
        arch/arc/include/asm/mmu_context.h
        arch/arc/kernel/troubleshoot.c
        arch/frv/mm/mmu-context.c
        arch/powerpc/platforms/cell/spufs/context.c
        arch/sparc/include/asm/mmu_context_32.h
        drivers/android/binder.c
        drivers/gpu/drm/etnaviv/etnaviv_gem.c
        drivers/gpu/drm/i915/i915_gem_userptr.c
        drivers/infiniband/core/umem.c
        drivers/infiniband/core/umem_odp.c
        drivers/infiniband/core/uverbs_main.c
        drivers/infiniband/hw/mlx4/main.c
        drivers/infiniband/hw/mlx5/main.c
        drivers/infiniband/hw/usnic/usnic_uiom.c
        drivers/iommu/amd_iommu_v2.c
        drivers/iommu/intel-svm.c
        drivers/lguest/lguest_user.c
        drivers/misc/cxl/fault.c
        drivers/misc/mic/scif/scif_rma.c
        drivers/oprofile/buffer_sync.c
        drivers/vfio/vfio_iommu_type1.c
        drivers/vhost/vhost.c
        drivers/xen/gntdev.c
        fs/exec.c
        fs/proc/array.c
        fs/proc/base.c
        fs/proc/task_mmu.c
        fs/proc/task_nommu.c
        fs/userfaultfd.c
        include/linux/sched/mm.h
        kernel/cpuset.c
        kernel/events/core.c
        kernel/events/uprobes.c
        kernel/exit.c
        kernel/fork.c
        kernel/ptrace.c
        kernel/sys.c
        kernel/trace/trace_output.c
        kernel/tsacct.c
        mm/memcontrol.c
        mm/memory.c
        mm/mempolicy.c
        mm/migrate.c
        mm/mmu_notifier.c
        mm/nommu.c
        mm/oom_kill.c
        mm/process_vm_access.c
        mm/rmap.c
        mm/swapfile.c
        mm/util.c
        virt/kvm/async_pf.c
      
        # mmput_async():
        include/linux/sched/mm.h
        kernel/fork.c
        mm/oom_kill.c
      
        # get_task_mm():
        arch/arc/kernel/troubleshoot.c
        arch/powerpc/platforms/cell/spufs/context.c
        drivers/android/binder.c
        drivers/gpu/drm/etnaviv/etnaviv_gem.c
        drivers/infiniband/core/umem.c
        drivers/infiniband/core/umem_odp.c
        drivers/infiniband/hw/mlx4/main.c
        drivers/infiniband/hw/mlx5/main.c
        drivers/infiniband/hw/usnic/usnic_uiom.c
        drivers/iommu/amd_iommu_v2.c
        drivers/iommu/intel-svm.c
        drivers/lguest/lguest_user.c
        drivers/misc/cxl/fault.c
        drivers/misc/mic/scif/scif_rma.c
        drivers/oprofile/buffer_sync.c
        drivers/vfio/vfio_iommu_type1.c
        drivers/vhost/vhost.c
        drivers/xen/gntdev.c
        fs/proc/array.c
        fs/proc/base.c
        fs/proc/task_mmu.c
        include/linux/sched/mm.h
        kernel/cpuset.c
        kernel/events/core.c
        kernel/exit.c
        kernel/fork.c
        kernel/ptrace.c
        kernel/sys.c
        kernel/trace/trace_output.c
        kernel/tsacct.c
        mm/memcontrol.c
        mm/memory.c
        mm/mempolicy.c
        mm/migrate.c
        mm/mmu_notifier.c
        mm/nommu.c
        mm/util.c
      
        # mm_access():
        fs/proc/base.c
        include/linux/sched/mm.h
        kernel/fork.c
        mm/process_vm_access.c
      
        # mm_release():
        arch/arc/include/asm/mmu_context.h
        fs/exec.c
        include/linux/sched/mm.h
        include/uapi/linux/sched.h
        kernel/exit.c
        kernel/fork.c
      Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Mike Galbraith <efault@gmx.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • powerpc/booke: Fix boot crash due to null hugepd · 3fb66a70
      Laurentiu Tudor authored
      On 32-bit book-e machines, hugepd_ok() no longer takes into account null
      hugepd values, causing this crash at boot:
      
        Unable to handle kernel paging request for data at address 0x80000000
        ...
        NIP [c0018378] follow_huge_addr+0x38/0xf0
        LR [c001836c] follow_huge_addr+0x2c/0xf0
        Call Trace:
         follow_huge_addr+0x2c/0xf0 (unreliable)
         follow_page_mask+0x40/0x3e0
         __get_user_pages+0xc8/0x450
         get_user_pages_remote+0x8c/0x250
         copy_strings+0x110/0x390
         copy_strings_kernel+0x2c/0x50
         do_execveat_common+0x478/0x630
         do_execve+0x2c/0x40
         try_to_run_init_process+0x18/0x60
         kernel_init+0xbc/0x110
         ret_from_kernel_thread+0x5c/0x64
      
      This impacts all NXP (ex-Freescale) 32-bit booke platforms.
      
      This was caused by the change of hugepd_t.pd from signed to unsigned,
      and the update to the nohash version of hugepd_ok(). Previously
      hugepd_ok() could exclude all non-huge and NULL pgds using > 0, whereas
      now we need to explicitly check that the value is not zero and also that
      PD_HUGE is *clear*.
      
      This isn't protected by the pgd_none() check in __find_linux_pte_or_hugepte()
      because on 32-bit we use pgtable-nopud.h, which causes the pgd_none()
      check to be always false.
      
      Fixes: 20717e1f ("powerpc/mm: Fix little-endian 4K hugetlb")
      Cc: stable@vger.kernel.org # v4.7+
      Reported-by: Madalin-Cristian Bucur <madalin.bucur@nxp.com>
      Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
      [mpe: Flesh out change log details.]
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
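The two conditions the change log above says must now be checked explicitly can be written out as a small C sketch. It is illustrative only: `hugepd_ok_sketch()` is a hypothetical name, the value is passed as a plain 32-bit integer rather than the kernel's hugepd_t, and the PD_HUGE value of 0x80000000 is an assumption for the 32-bit case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed 32-bit value of the flag; not taken from the change log. */
#define PD_HUGE 0x80000000u

/*
 * With an unsigned pd, "> 0" no longer filters out entries that have the
 * top bit set, so both conditions are spelled out: the value must be
 * non-zero and PD_HUGE must be clear.
 */
bool hugepd_ok_sketch(uint32_t pd)
{
    return pd != 0 && (pd & PD_HUGE) == 0;
}

/* Usage example covering the three cases from the change log. */
void hugepd_ok_example(void)
{
    assert(!hugepd_ok_sketch(0));                  /* NULL hugepd */
    assert(!hugepd_ok_sketch(PD_HUGE | 0x1000u));  /* PD_HUGE set */
    assert(hugepd_ok_sketch(0x1000u));             /* non-zero, PD_HUGE clear */
}
```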
    • powerpc: Fix compiling a BE kernel with a powerpc64le toolchain · 4dc831aa
      Nicholas Piggin authored
      GCC can compile with either endian, but the default ABI version is set
      based on the default endianness of the toolchain. Alan Modra says:
      
        you need both -mbig and -mabi=elfv1 to make a powerpc64le gcc
        generate powerpc64 code
      
      The opposite is true for a powerpc64 toolchain: when generating -mlittle
      code it requires -mabi=elfv2 to generate the v2 ABI, which we were already doing.
      
      This change adds ABI annotations together with endianness for all cases,
      LE and BE. This fixes the case of building a BE kernel with a toolchain
      that is LE by default.
      Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
      Tested-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/powernv: Fix bug due to labeling ambiguity in power_enter_stop · 424f8acd
      Gautham R. Shenoy authored
      Commit 09206b60 ("powernv: Pass PSSCR value and mask to
      power9_idle_stop") added additional code in power_enter_stop() to
      distinguish between stop requests whose PSSCR had ESL=EC=1 from those
      which did not. When ESL=EC=1, we do a forward-jump to a location
      labelled by "1", which had the code to handle the ESL=EC=1 case.
      
      Unfortunately, just a couple of instructions before this label is the
      macro IDLE_STATE_ENTER_SEQ(), which also has a label "1" in its
      expansion.
      
      As a result, the current code can end up directly executing the stop
      instruction for deep stop requests with PSSCR ESL=EC=1, without saving
      the hypervisor state.
      
      Fix this BUG by labeling the location that handles ESL=EC=1 case with
      a more descriptive label ".Lhandle_esl_ec_set" (local label suggestion
      a la .Lxx from Anton Blanchard).
      
      While at it, rename the label "2", which marks the code handling
      entry into deep stop states, to ".Lhandle_deep_stop".
      
      For good measure, change the label in the IDLE_STATE_ENTER_SEQ() macro
      to a not-so-commonly used value in order to avoid similar mishaps in
      the future.
      
      Fixes: 09206b60 ("powernv: Pass PSSCR value and mask to power9_idle_stop")
      Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
    • powerpc/64: Invalidate process table caching after setting process table · 7a70d728
      Paul Mackerras authored
      The POWER9 MMU reads and caches entries from the process table.
      When we kexec from one kernel to another, the second kernel sets
      its process table pointer but doesn't currently do anything to
      make the CPU invalidate any cached entries from the old process table.
      This adds a tlbie (TLB invalidate entry) instruction with parameters
      to invalidate caching of the process table after the new process
      table is installed.
      Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
      Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>