1. 09 Oct 2018, 23 commits
• KVM: PPC: Book3S HV: Add a VM capability to enable nested virtualization · aa069a99
  Committed by Paul Mackerras
      With this, userspace can enable a KVM-HV guest to run nested guests
      under it.
      
The administrator can control whether any nested guests can be run;
setting the "nested" module parameter to false prevents any guest from
becoming a nested hypervisor (that is, any attempt to enable the nested
capability on a guest will fail).  Guests which are already nested
hypervisors will continue to be so.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
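As an illustration, userspace could enable this on a VM along the following
lines (a minimal sketch assuming the KVM_CAP_PPC_NESTED_HV capability name
from this series; error handling elided):

  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  /* Enable nested HV on an existing VM file descriptor. */
  static int enable_nested_hv(int vm_fd)
  {
  	struct kvm_enable_cap cap = {
  		.cap = KVM_CAP_PPC_NESTED_HV,
  	};

  	/* Fails if the kvm-hv module was loaded with nested=false. */
  	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
  }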
• KVM: PPC: Book3S HV: Add nested shadow page tables to debugfs · 83a05510
  Committed by Paul Mackerras
      This adds a list of valid shadow PTEs for each nested guest to
      the 'radix' file for the guest in debugfs.  This can be useful for
      debugging.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
• KVM: PPC: Book3S HV: Sanitise hv_regs on nested guest entry · 73937deb
  Committed by Suraj Jitindar Singh
      restore_hv_regs() is used to copy the hv_regs L1 wants to set to run the
      nested (L2) guest into the vcpu structure. We need to sanitise these
      values to ensure we don't let the L1 guest hypervisor do things we don't
      want it to.
      
      We don't let data address watchpoints or completed instruction address
      breakpoints be set to match in hypervisor state.
      
We also don't let L1 enable features in the hypervisor facility status
and control register (HFSCR) for L2 which we have disabled for L1. That
is, L2 will get only those features which both the L0 hypervisor has
enabled for L1 and L1 wants to enable for L2. This could mean we give
L1 a hypervisor facility unavailable interrupt for a facility it thinks
it has enabled; however, it shouldn't have enabled for the L2 guest a
facility it doesn't itself have.
      
      We sanitise the registers when copying in the L2 hv_regs. We don't need
      to sanitise when copying back the L1 hv_regs since these shouldn't be
      able to contain invalid values as they're just what was copied out.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
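The sanitisation described above boils down to a few masking operations,
roughly as sketched below (mask names are from asm/reg.h; the exact set of
fields handled in the kernel may differ):

  static void sanitise_hv_regs(struct kvm_vcpu *vcpu, struct hv_guest_state *hr)
  {
  	/* Data address watchpoints must not match in hypervisor state. */
  	hr->dawrx0 &= ~DAWRX_HYP;
  	hr->dawrx1 &= ~DAWRX_HYP;

  	/* Instruction address breakpoints must not match in HV state. */
  	if ((hr->ciabr & CIABR_PRIV) == CIABR_PRIV_HYPER)
  		hr->ciabr &= ~CIABR_PRIV;

  	/* L2 only gets the HFSCR facilities that L0 enabled for L1. */
  	hr->hfscr &= (HFSCR_INTR_CAUSE | vcpu->arch.hfscr);
  }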
• KVM: PPC: Book3S HV: Add one-reg interface to virtual PTCR register · 30323418
  Committed by Paul Mackerras
      This adds a one-reg register identifier which can be used to read and
      set the virtual PTCR for the guest.  This register identifies the
      address and size of the virtual partition table for the guest, which
      contains information about the nested guests under this guest.
      
      Migrating this value is the only extra requirement for migrating a
      guest which has nested guests (assuming of course that the destination
      host supports nested virtualization in the kvm-hv module).
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
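For migration, userspace would read the virtual PTCR on the source host and
restore it on the destination via the one-reg API, along these lines (a
sketch; only the KVM_REG_PPC_PTCR identifier comes from this patch):

  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  /* Read the guest's virtual PTCR; use KVM_SET_ONE_REG to write it back. */
  static int get_virtual_ptcr(int vcpu_fd, __u64 *ptcr)
  {
  	struct kvm_one_reg reg = {
  		.id   = KVM_REG_PPC_PTCR,
  		.addr = (__u64)(unsigned long)ptcr,
  	};

  	return ioctl(vcpu_fd, KVM_GET_ONE_REG, &reg);
  }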
• KVM: PPC: Book3S HV: Invalidate TLB when nested vcpu moves physical cpu · 9d0b048d
  Committed by Suraj Jitindar Singh
This is only done at level 0, since only level 0 knows which physical
CPU a vcpu is running on.  This does for nested guests what L0 already
did for its own guests: flush the TLB on a pCPU when it goes to run a
vCPU there, if another vCPU in the same VM previously ran on this pCPU
and has since started to run on another pCPU.  This handles the
situation where the other vCPU touched a mapping, moved to another
pCPU and did a tlbiel (local-only tlbie) on that new pCPU, thus leaving
behind a stale TLB entry on this pCPU.
      
This introduces a limit on the vcpu_token values used in the
H_ENTER_NESTED hcall -- they must now be less than NR_CPUS.
      
      [paulus@ozlabs.org - made prev_cpu array be short[] to reduce
       memory consumption.]
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
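Conceptually, L0 keeps a prev_cpu entry per vcpu_token (hence the NR_CPUS
limit) and marks the old pCPU as needing a flush whenever a nested vCPU
shows up somewhere else; a simplified sketch with hypothetical helper names:

  /* On picking a pCPU to run a nested vCPU (simplified sketch). */
  static void note_nested_vcpu_move(struct kvm_nested_guest *gp,
  				  int vcpu_token, int pcpu)
  {
  	short prev = gp->prev_cpu[vcpu_token];	/* short[] to save memory */

  	if (prev >= 0 && prev != pcpu)
  		/*
  		 * The vCPU last ran on 'prev' and may have left stale TLB
  		 * entries there (its tlbiel only cleans the new pCPU), so
  		 * force a flush before this VM next runs on 'prev'.
  		 */
  		cpumask_set_cpu(prev, &gp->need_tlb_flush);
  	gp->prev_cpu[vcpu_token] = pcpu;
  }

  /* Then, on guest entry: */
  if (cpumask_test_cpu(pcpu, &gp->need_tlb_flush)) {
  	flush_guest_tlb(gp);			/* tlbiel loop; hypothetical */
  	cpumask_clear_cpu(pcpu, &gp->need_tlb_flush);
  }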
• KVM: PPC: Book3S HV: Use hypercalls for TLB invalidation when nested · 690ed4ca
  Committed by Paul Mackerras
      This adds code to call the H_TLB_INVALIDATE hypercall when running as
      a guest, in the cases where we need to invalidate TLBs (or other MMU
      caches) as part of managing the mappings for a nested guest.  Calling
      H_TLB_INVALIDATE lets the nested hypervisor inform the parent
      hypervisor about changes to partition-scoped page tables or the
      partition table without needing to do hypervisor-privileged tlbie
      instructions.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
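In essence, the TLB management helpers now branch on whether we are the real
(L0) hypervisor, roughly as in this sketch (kvmhv_on_pseries() and the
H_TLBIE_P1_ENC packing of the RIC/PRS/R fields are assumptions from this
series; do_tlbies_for_lpid() is hypothetical):

  /* Flush the whole TLB for an LPID, either directly or via our parent. */
  static void kvmhv_flush_lpid(unsigned int lpid)
  {
  	if (!kvmhv_on_pseries()) {
  		/* We are L0: the tlbie instruction is permitted. */
  		do_tlbies_for_lpid(lpid);
  		return;
  	}

  	/* We are a nested hypervisor: ask our parent instead. */
  	plpar_hcall_norets(H_TLB_INVALIDATE,
  			   H_TLBIE_P1_ENC(2, 0, 1),	/* RIC=2, PRS=0, R=1 */
  			   lpid, 0);
  }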
• KVM: PPC: Book3S HV: Implement H_TLB_INVALIDATE hcall · e3b6b466
  Committed by Suraj Jitindar Singh
      When running a nested (L2) guest the guest (L1) hypervisor will use
      the H_TLB_INVALIDATE hcall when it needs to change the partition
      scoped page tables or the partition table which it manages.  It will
      use this hcall in the situations where it would use a partition-scoped
      tlbie instruction if it were running in hypervisor mode.
      
      The H_TLB_INVALIDATE hcall can invalidate different scopes:
      
      Invalidate TLB for a given target address:
      - This invalidates a single L2 -> L1 pte
- We need to invalidate any L2 -> L0 shadow_pgtable ptes which map the L2
  address space being invalidated. This is because a single
  L2 -> L1 pte may have been mapped with more than one pte in the
  L2 -> L0 page tables.
      
      Invalidate the entire TLB for a given LPID or for all LPIDs:
      - Invalidate the entire shadow_pgtable for a given nested guest, or
        for all nested guests.
      
      Invalidate the PWC (page walk cache) for a given LPID or for all LPIDs:
      - We don't cache the PWC, so nothing to do.
      
      Invalidate the entire TLB, PWC and partition table for a given/all LPIDs:
      - Here we re-read the partition table entry and remove the nested state
        for any nested guest for which the first doubleword of the partition
        table entry is now zero.
      
      The H_TLB_INVALIDATE hcall takes as parameters the tlbie instruction
      word (of which only the RIC, PRS and R fields are used), the rS value
      (giving the lpid, where required) and the rB value (giving the IS, AP
      and EPN values).
      
      [paulus@ozlabs.org - adapted to having the partition table in guest
      memory, added the H_TLB_INVALIDATE implementation, removed tlbie
      instruction emulation, reworded the commit message.]
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
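The top-level dispatch on the tlbie RIC (Radix Invalidation Control) field
therefore looks roughly like this (helper names are illustrative; the RIC
values are the architected ones):

  switch (ric) {
  case RIC_FLUSH_TLB:	/* 0: TLB entries */
  	if (addr_valid)		/* the IS field selected a single address */
  		/* May remove more than one shadow pte per L2 -> L1 pte. */
  		invalidate_shadow_ptes_for_addr(gp, epn);
  	else
  		free_shadow_pgtable(gp);	/* whole LPID, or all LPIDs */
  	break;
  case RIC_FLUSH_PWC:	/* 1: page walk cache */
  	/* We don't cache the PWC, so nothing to do. */
  	break;
  case RIC_FLUSH_ALL:	/* 2: TLB, PWC and partition table caching */
  	/* Re-read the partition table entry and drop the nested state
  	   for any guest whose first doubleword is now zero. */
  	refresh_nested_state_from_ptbl(gp);
  	break;
  }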
• KVM: PPC: Book3S HV: Introduce rmap to track nested guest mappings · 8cf531ed
  Committed by Suraj Jitindar Singh
      When a host (L0) page which is mapped into a (L1) guest is in turn
      mapped through to a nested (L2) guest we keep a reverse mapping (rmap)
      so that these mappings can be retrieved later.
      
      Whenever we create an entry in a shadow_pgtable for a nested guest we
      create a corresponding rmap entry and add it to the list for the
      L1 guest memslot at the index of the L1 guest page it maps. This means
      at the L1 guest memslot we end up with lists of rmaps.
      
      When we are notified of a host page being invalidated which has been
      mapped through to a (L1) guest, we can then walk the rmap list for that
      guest page, and find and invalidate all of the corresponding
      shadow_pgtable entries.
      
      In order to reduce memory consumption, we compress the information for
      each rmap entry down to 52 bits -- 12 bits for the LPID and 40 bits
      for the guest real page frame number -- which will fit in a single
      unsigned long.  To avoid a scenario where a guest can trigger
      unbounded memory allocations, we scan the list when adding an entry to
      see if there is already an entry with the contents we need.  This can
      occur, because we don't ever remove entries from the middle of a list.
      
A nested guest rmap struct is a list pointer and an rmap entry:
      ----------------
      | next pointer |
      ----------------
      | rmap entry   |
      ----------------
      
      Thus the rmap pointer for each guest frame number in the memslot can be
      either NULL, a single entry, or a pointer to a list of nested rmap entries.
      
      gfn	 memslot rmap array
       	-------------------------
       0	| NULL			|	(no rmap entry)
       	-------------------------
       1	| single rmap entry	|	(rmap entry with low bit set)
       	-------------------------
       2	| list head pointer	|	(list of rmap entries)
       	-------------------------
      
The final entry always has the lowest bit set and is stored in the next
pointer of the last list entry, or as a single rmap entry.
A list of rmap entries looks like this:
      
      -----------------	-----------------	-------------------------
      | list head ptr	| ----> | next pointer	| ---->	| single rmap entry	|
      -----------------	-----------------	-------------------------
      			| rmap entry	|	| rmap entry		|
      			-----------------	-------------------------
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
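A sketch of the compressed entry described above (constants illustrative,
matching the 12-bit LPID + 40-bit page-frame split; the low bit
distinguishes a direct entry from a list pointer, which is always at least
8-byte aligned and so has a zero low bit):

  /* 12-bit LPID | 40-bit guest real page number | low bit = single entry. */
  #define RMAP_NESTED_LPID_SHIFT		52
  #define RMAP_NESTED_GPA_MASK		0x000ffffffffff000UL
  #define RMAP_NESTED_IS_SINGLE_ENTRY	0x1UL

  static unsigned long make_nested_rmap(unsigned int lpid, unsigned long gpa)
  {
  	return ((unsigned long)lpid << RMAP_NESTED_LPID_SHIFT) |
  	       (gpa & RMAP_NESTED_GPA_MASK);
  }

  /* An entry stored directly in the memslot array, or terminating a list,
     additionally has RMAP_NESTED_IS_SINGLE_ENTRY set. */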
• KVM: PPC: Book3S HV: Handle page fault for a nested guest · fd10be25
  Committed by Suraj Jitindar Singh
      Consider a normal (L1) guest running under the main hypervisor (L0),
      and then a nested guest (L2) running under the L1 guest which is acting
      as a nested hypervisor. L0 has page tables to map the address space for
L1 providing the translation from L1 real address -> L0 real address:
      
      	L1
      	|
      	| (L1 -> L0)
      	|
      	----> L0
      
There are also page tables in L1 used to map the address space for L2,
providing the translation from L2 real address -> L1 real address. Since
the hardware can only walk a single level of page table, we need to
maintain in L0 a "shadow_pgtable" for L2 which provides the translation
from L2 real address -> L0 real address. This looks like:
      
      	L2				L2
      	|				|
      	| (L2 -> L1)			|
      	|				|
      	----> L1			| (L2 -> L0)
      	      |				|
      	      | (L1 -> L0)		|
      	      |				|
      	      ----> L0			--------> L0
      
      When a page fault occurs while running a nested (L2) guest we need to
      insert a pte into this "shadow_pgtable" for the L2 -> L0 mapping. To
      do this we need to:
      
      1. Walk the pgtable in L1 memory to find the L2 -> L1 mapping, and
         provide a page fault to L1 if this mapping doesn't exist.
      2. Use our L1 -> L0 pgtable to convert this L1 address to an L0 address,
         or try to insert a pte for that mapping if it doesn't exist.
3. Now that we have an L2 -> L0 mapping, insert it into our shadow_pgtable.
      
      Once this mapping exists we can take rc faults when hardware is unable
      to automatically set the reference and change bits in the pte. On these
      we need to:
      
1. Check that the rc bits in the L2 -> L1 pte match, and otherwise
   reflect the fault down to L1.
      2. Set the rc bits in the L1 -> L0 pte which corresponds to the same
         host page.
      3. Set the rc bits in the L2 -> L0 pte.
      
As we reuse a large number of functions in book3s_64_mmu_radix.c for
this, we also needed to refactor a number of these functions to take
an lpid parameter so that the correct lpid is used for TLB invalidations.
The functionality, however, remains the same.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
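In outline, the fault path reads like the following pseudocode-level sketch
(helper names hypothetical):

  /* Page fault at L2 real address 'gpa' while running a nested guest. */
  static int kvmhv_nested_page_fault(struct kvm_vcpu *vcpu,
  				   struct kvm_nested_guest *gp,
  				   unsigned long gpa, unsigned long dsisr)
  {
  	unsigned long l1_gpa, l0_hpa;

  	/* 1. Walk the L2 -> L1 table, which lives in L1 memory. */
  	if (walk_nested_pgtable(gp, gpa, &l1_gpa))
  		return reflect_fault_to_l1(vcpu, gpa, dsisr);

  	/* 2. Translate L1 real -> L0 real, inserting a pte if needed. */
  	if (lookup_or_insert_l1_pte(vcpu, l1_gpa, &l0_hpa))
  		return -EFAULT;

  	/* 3. Insert the combined L2 -> L0 pte into the shadow_pgtable,
  	 *    recording an rmap entry so it can be invalidated later. */
  	return insert_shadow_pte(gp, gpa, l0_hpa);
  }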
• KVM: PPC: Book3S HV: Handle hypercalls correctly when nested · 4bad7779
  Committed by Paul Mackerras
      When we are running as a nested hypervisor, we use a hypercall to
      enter the guest rather than code in book3s_hv_rmhandlers.S.  This means
      that the hypercall handlers listed in hcall_real_table never get called.
      There are some hypercalls that are handled there and not in
      kvmppc_pseries_do_hcall(), which therefore won't get processed for
      a nested guest.
      
      To fix this, we add cases to kvmppc_pseries_do_hcall() to handle those
      hypercalls, with the following exceptions:
      
      - The HPT hypercalls (H_ENTER, H_REMOVE, etc.) are not handled because
        we only support radix mode for nested guests.
      
      - H_CEDE has to be handled specially because the cede logic in
        kvmhv_run_single_vcpu assumes that it has been processed by the time
        that kvmhv_p9_guest_entry() returns.  Therefore we put a special
        case for H_CEDE in kvmhv_p9_guest_entry().
      
      For the XICS hypercalls, if real-mode processing is enabled, then the
      virtual-mode handlers assume that they are being called only to finish
      up the operation.  Therefore we turn off the real-mode flag in the XICS
      code when running as a nested hypervisor.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
• KVM: PPC: Book3S HV: Nested guest entry via hypercall · 360cae31
  Committed by Paul Mackerras
      This adds a new hypercall, H_ENTER_NESTED, which is used by a nested
      hypervisor to enter one of its nested guests.  The hypercall supplies
      register values in two structs.  Those values are copied by the level 0
      (L0) hypervisor (the one which is running in hypervisor mode) into the
      vcpu struct of the L1 guest, and then the guest is run until an
      interrupt or error occurs which needs to be reported to L1 via the
      hypercall return value.
      
      Currently this assumes that the L0 and L1 hypervisors are the same
      endianness, and the structs passed as arguments are in native
      endianness.  If they are of different endianness, the version number
      check will fail and the hcall will be rejected.
      
      Nested hypervisors do not support indep_threads_mode=N, so this adds
      code to print a warning message if the administrator has set
      indep_threads_mode=N, and treat it as Y.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
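Sketched in abbreviated form, the register-state struct and the call look
like this (fields abridged and illustrative; the version field is what makes
a mismatched-endian L0 reject the hcall):

  /* HV-privileged register state L1 hands to L0 (abridged sketch). */
  struct hv_guest_state {
  	u64	version;	/* must match what L0 expects */
  	u32	lpid;		/* L1's id for the nested guest */
  	u32	vcpu_token;
  	u64	lpcr;
  	u64	pcr;
  	u64	hfscr;
  	/* ... remaining hypervisor SPRs ... */
  };

  /* L1 passes guest-real pointers to the two structs and receives the
     interrupt vector that caused the exit as the hcall return value:
       trap = plpar_hcall(H_ENTER_NESTED, retbuf,
  			  __pa(&hv_state), __pa(&l2_regs));         */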
• KVM: PPC: Book3S HV: Framework and hcall stubs for nested virtualization · 8e3f5fc1
  Committed by Paul Mackerras
      This starts the process of adding the code to support nested HV-style
      virtualization.  It defines a new H_SET_PARTITION_TABLE hypercall which
      a nested hypervisor can use to set the base address and size of a
      partition table in its memory (analogous to the PTCR register).
      On the host (level 0 hypervisor) side, the H_SET_PARTITION_TABLE
      hypercall from the guest is handled by code that saves the virtual
      PTCR value for the guest.
      
      This also adds code for creating and destroying nested guests and for
      reading the partition table entry for a nested guest from L1 memory.
      Each nested guest has its own shadow LPID value, different in general
      from the LPID value used by the nested hypervisor to refer to it.  The
      shadow LPID value is allocated at nested guest creation time.
      
      Nested hypervisor functionality is only available for a radix guest,
      which therefore means a radix host on a POWER9 (or later) processor.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
• KVM: PPC: Book3S HV: Make kvmppc_mmu_radix_xlate process/partition table agnostic · 9811c78e
  Committed by Suraj Jitindar Singh
kvmppc_mmu_radix_xlate() is used to translate an effective address
through the process tables. The process table and partition table have
identical layouts. Exploit this fact to make the kvmppc_mmu_radix_xlate()
function able to translate either an effective address through the
process tables or a guest real address through the partition tables.
      
      [paulus@ozlabs.org - reduced diffs from previous code]
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
• KVM: PPC: Use ccr field in pt_regs struct embedded in vcpu struct · fd0944ba
  Committed by Paul Mackerras
      When the 'regs' field was added to struct kvm_vcpu_arch, the code
      was changed to use several of the fields inside regs (e.g., gpr, lr,
      etc.) but not the ccr field, because the ccr field in struct pt_regs
      is 64 bits on 64-bit platforms, but the cr field in kvm_vcpu_arch is
      only 32 bits.  This changes the code to use the regs.ccr field
      instead of cr, and changes the assembly code on 64-bit platforms to
      use 64-bit loads and stores instead of 32-bit ones.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
• KVM: PPC: Book3S HV: Add a debugfs file to dump radix mappings · 9a94d3ee
  Committed by Paul Mackerras
      This adds a file called 'radix' in the debugfs directory for the
      guest, which when read gives all of the valid leaf PTEs in the
      partition-scoped radix tree for a radix guest, in human-readable
format.  It is analogous to the existing 'htab' file which dumps
the HPT entries for an HPT guest.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
• KVM: PPC: Book3S HV: Handle hypervisor instruction faults better · 32eb150a
  Committed by Paul Mackerras
      Currently the code for handling hypervisor instruction page faults
      passes 0 for the flags indicating the type of fault, which is OK in
      the usual case that the page is not mapped in the partition-scoped
      page tables.  However, there are other causes for hypervisor
instruction page faults, such as not being able to update a reference
      (R) or change (C) bit.  The cause is indicated in bits in HSRR1,
      including a bit which indicates that the fault is due to not being
      able to write to a page (for example to update an R or C bit).
      Not handling these other kinds of faults correctly can lead to a
      loop of continual faults without forward progress in the guest.
      
      In order to handle these faults better, this patch constructs a
      "DSISR-like" value from the bits which DSISR and SRR1 (for a HISI)
      have in common, and passes it to kvmppc_book3s_hv_page_fault() so
      that it knows what caused the fault.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
• KVM: PPC: Book3S HV: Streamlined guest entry/exit path on P9 for radix guests · 95a6432c
  Committed by Paul Mackerras
      This creates an alternative guest entry/exit path which is used for
      radix guests on POWER9 systems when we have indep_threads_mode=Y.  In
      these circumstances there is exactly one vcpu per vcore and there is
      no coordination required between vcpus or vcores; the vcpu can enter
      the guest without needing to synchronize with anything else.
      
      The new fast path is implemented almost entirely in C in book3s_hv.c
      and runs with the MMU on until the guest is entered.  On guest exit
      we use the existing path until the point where we are committed to
      exiting the guest (as distinct from handling an interrupt in the
      low-level code and returning to the guest) and we have pulled the
      guest context from the XIVE.  At that point we check a flag in the
stack frame to see whether we came in via the old path or the new
      path; if we came in via the new path then we go back to C code to do
      the rest of the process of saving the guest context and restoring the
      host context.
      
      The C code is split into separate functions for handling the
      OS-accessible state and the hypervisor state, with the idea that the
      latter can be replaced by a hypercall when we implement nested
      virtualization.
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[mpe: Fix CONFIG_ALTIVEC=n build]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
• KVM: PPC: Book3S: Rework TM save/restore code and make it C-callable · 7854f754
  Committed by Paul Mackerras
      This adds a parameter to __kvmppc_save_tm and __kvmppc_restore_tm
      which allows the caller to indicate whether it wants the nonvolatile
      register state to be preserved across the call, as required by the C
      calling conventions.  This parameter being non-zero also causes the
      MSR bits that enable TM, FP, VMX and VSX to be preserved.  The
      condition register and DSCR are now always preserved.
      
      With this, kvmppc_save_tm_hv and kvmppc_restore_tm_hv can be called
      from C code provided the 3rd parameter is non-zero.  So that these
      functions can be called from modules, they now include code to set
      the TOC pointer (r2) on entry, as they can call other built-in C
      functions which will assume the TOC to have been set.
      
      Also, the fake suspend code in kvmppc_save_tm_hv is modified here to
      assume that treclaim in fake-suspend state does not modify any registers,
      which is the case on POWER9.  This enables the code to be simplified
      quite a bit.
      
      _kvmppc_save_tm_pr and _kvmppc_restore_tm_pr become much simpler with
      this change, since they now only need to save and restore TAR and pass
      1 for the 3rd argument to __kvmppc_{save,restore}_tm.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
• KVM: PPC: Book3S HV: Extract PMU save/restore operations as C-callable functions · 41f4e631
  Committed by Paul Mackerras
      This pulls out the assembler code that is responsible for saving and
      restoring the PMU state for the host and guest into separate functions
      so they can be used from an alternate entry path.  The calling
      convention is made compatible with C.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
• KVM: PPC: Book3S HV: Move interrupt delivery on guest entry to C code · f7035ce9
  Committed by Paul Mackerras
      This is based on a patch by Suraj Jitindar Singh.
      
      This moves the code in book3s_hv_rmhandlers.S that generates an
      external, decrementer or privileged doorbell interrupt just before
      entering the guest to C code in book3s_hv_builtin.c.  This is to
      make future maintenance and modification easier.  The algorithm
      expressed in the C code is almost identical to the previous
      algorithm.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
• KVM: PPC: Book3S: Simplify external interrupt handling · d24ea8a7
  Committed by Paul Mackerras
      Currently we use two bits in the vcpu pending_exceptions bitmap to
      indicate that an external interrupt is pending for the guest, one
      for "one-shot" interrupts that are cleared when delivered, and one
      for interrupts that persist until cleared by an explicit action of
      the OS (e.g. an acknowledge to an interrupt controller).  The
      BOOK3S_IRQPRIO_EXTERNAL bit is used for one-shot interrupt requests
      and BOOK3S_IRQPRIO_EXTERNAL_LEVEL is used for persisting interrupts.
      
      In practice BOOK3S_IRQPRIO_EXTERNAL never gets used, because our
      Book3S platforms generally, and pseries in particular, expect
      external interrupt requests to persist until they are acknowledged
      at the interrupt controller.  That combined with the confusion
      introduced by having two bits for what is essentially the same thing
      makes it attractive to simplify things by only using one bit.  This
      patch does that.
      
      With this patch there is only BOOK3S_IRQPRIO_EXTERNAL, and by default
      it has the semantics of a persisting interrupt.  In order to avoid
      breaking the ABI, we introduce a new "external_oneshot" flag which
      preserves the behaviour of the KVM_INTERRUPT ioctl with the
      KVM_INTERRUPT_SET argument.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
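The preserved one-shot semantics reduce to a small amount of logic around
queueing and delivery, roughly (a sketch of the idea, not the exact diff):

  /* Queueing, in the KVM_INTERRUPT ioctl handler: */
  if (irq->irq == KVM_INTERRUPT_SET)
  	vcpu->arch.external_oneshot = 1;	/* clear on delivery */
  kvmppc_book3s_queue_irqprio(vcpu, BOOK3S_INTERRUPT_EXTERNAL);

  /* Delivery: a persisting interrupt stays pending, a one-shot doesn't. */
  if (vcpu->arch.external_oneshot) {
  	vcpu->arch.external_oneshot = 0;
  	kvmppc_book3s_dequeue_irqprio(vcpu, BOOK3S_INTERRUPT_EXTERNAL);
  }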
• KVM: PPC: Remove redundand permission bits removal · a3ac077b
  Committed by Alexey Kardashevskiy
      The kvmppc_gpa_to_ua() helper itself takes care of the permission
      bits in the TCE and yet every single caller removes them.
      
      This changes semantics of kvmppc_gpa_to_ua() so it takes TCEs
      (which are GPAs + TCE permission bits) to make the callers simpler.
      
      This should cause no behavioural change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
• KVM: PPC: Validate TCEs against preregistered memory page sizes · 42de7b9e
  Committed by Alexey Kardashevskiy
Userspace can request an arbitrary supported page size for a DMA
window, and this works fine as long as the mapped memory is backed with
pages of the same or bigger size; if this is not the case,
mm_iommu_ua_to_hpa{_rm}() fail and the tables are not populated with
dangerously incorrect TCEs.
      
However, since it is quite easy to misconfigure KVM, and we do not
revert all changes made to TCE tables if an error happens in the middle,
we had better do the acceptable page size validation before we even
touch the tables.
      
      This enhances kvmppc_tce_validate() to check the hardware IOMMU page sizes
      against the preregistered memory page sizes.
      
      Since the new check uses real/virtual mode helpers, this renames
      kvmppc_tce_validate() to kvmppc_rm_tce_validate() to handle the real mode
      case and mirrors it for the virtual mode under the old name. The real
      mode handler is not used for the virtual mode as:
      1. it uses _lockless() list traversing primitives instead of RCU;
      2. realmode's mm_iommu_ua_to_hpa_rm() uses vmalloc_to_phys() which
      virtual mode does not have to use and since on POWER9+radix only virtual
      mode handlers actually work, we do not want to slow down that path even
      a bit.
      
      This removes EXPORT_SYMBOL_GPL(kvmppc_tce_validate) as the validators
      are static now.
      
From now on, attempts to map IOMMU pages bigger than allowed
will result in a KVM exit.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[mpe: Fix KVM_HV=n build]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
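The virtual-mode validator's new check is essentially the following (a
sketch; locking and the rest of the TCE checks are elided, and the field
names are assumptions):

  static long kvmppc_tce_validate(struct kvmppc_spapr_tce_table *stt,
  				unsigned long tce)
  {
  	unsigned long ua, hpa;
  	struct mm_iommu_table_group_mem_t *mem;

  	ua = tce & ~(TCE_PCI_READ | TCE_PCI_WRITE);

  	mem = mm_iommu_lookup(stt->kvm->mm, ua, 1ULL << stt->page_shift);
  	if (!mem)
  		return H_TOO_HARD;

  	/* Fails if the preregistered pages are smaller than the IOMMU page. */
  	if (mm_iommu_ua_to_hpa(mem, ua, stt->page_shift, &hpa))
  		return H_PARAMETER;

  	return H_SUCCESS;
  }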
2. 12 Sep 2018, 1 commit
• KVM: PPC: Avoid marking DMA-mapped pages dirty in real mode · 425333bf
  Committed by Alexey Kardashevskiy
At the moment the real mode handler of H_PUT_TCE calls iommu_tce_xchg_rm(),
which in turn reads the old TCE and, if it was a valid entry, marks
the physical page dirty if it was mapped for writing. Since this is in
real mode, realmode_pfn_to_page() is used instead of pfn_to_page()
to get the page struct. However, SetPageDirty() itself reads the compound
page head and returns a virtual address for the head page struct, and
setting the dirty bit on that kills the system.
      
      This adds additional dirty bit tracking into the MM/IOMMU API for use
      in the real mode. Note that this does not change how VFIO and
      KVM (in virtual mode) set this bit. The KVM (real mode) changes include:
      - use the lowest bit of the cached host phys address to carry
      the dirty bit;
      - mark pages dirty when they are unpinned which happens when
      the preregistered memory is released which always happens in virtual
      mode;
      - add mm_iommu_ua_mark_dirty_rm() helper to set delayed dirty bit;
      - change iommu_tce_xchg_rm() to take the kvm struct for the mm to use
      in the new mm_iommu_ua_mark_dirty_rm() helper;
      - move iommu_tce_xchg_rm() to book3s_64_vio_hv.c (which is the only
      caller anyway) to reduce the real mode KVM and IOMMU knowledge
      across different subsystems.
      
      This removes realmode_pfn_to_page() as it is not used anymore.
      
While we are at it, remove some EXPORT_SYMBOL_GPL()s, as that code is
for real mode only and modules cannot call it anyway.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
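The delayed dirty tracking amounts to stashing the flag in bit 0 of the
cached host physical address, roughly as below (a sketch following the
description above; internal field names may differ):

  /* Bit 0 of the cached host physical address carries the dirty bit. */
  #define MM_IOMMU_TABLE_GROUP_PAGE_DIRTY	0x1UL

  void mm_iommu_ua_mark_dirty_rm(struct mm_struct *mm, unsigned long ua)
  {
  	struct mm_iommu_table_group_mem_t *mem;
  	long entry;

  	mem = mm_iommu_lookup_rm(mm, ua, PAGE_SIZE);
  	if (!mem)
  		return;

  	entry = (ua - mem->ua) >> PAGE_SHIFT;
  	mem->hpas[entry] |= MM_IOMMU_TABLE_GROUP_PAGE_DIRTY;
  }

  /* At unpin time (virtual mode), the flag becomes a SetPageDirty() call. */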
3. 23 Aug 2018, 2 commits
• powerpc/mm/books3s: Add new pte bit to mark pte temporarily invalid. · bd0dbb73
  Committed by Aneesh Kumar K.V
When splitting a huge pmd pte, we need to mark the pmd entry invalid. We
can do that by clearing the _PAGE_PRESENT bit. But then it will be taken
as a swap pte. In order to differentiate between the two, use a software
pte bit when invalidating.

For regular ptes, due to bd5050e3 ("powerpc/mm/radix: Change pte relax
sequence to handle nest MMU hang") we need to mark the pte entry invalid
when relaxing access permission. Instead of marking it pte_none, which
could result in page table walk routines skipping this pte entry,
invalidate it but still keep it marked present.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
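The essence of the change, as a sketch (assuming _RPAGE_SW0 is the software
pte bit used, and the book3s/64 big-endian pte accessors):

  /* Software bit: pte is temporarily invalid, but must not look like a
     swap pte, nor be skipped by page table walkers. */
  #define _PAGE_INVALID		_RPAGE_SW0

  static inline int pte_present(pte_t pte)
  {
  	/* An invalidated pte still counts as present. */
  	return !!(pte_raw(pte) &
  		  cpu_to_be64(_PAGE_PRESENT | _PAGE_INVALID));
  }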
• powerpc/nohash: fix pte_access_permitted() · 810e9f86
  Committed by Christophe Leroy
      Commit 5769beaf ("powerpc/mm: Add proper pte access check helper
      for other platforms") replaced generic pte_access_permitted() by an
      arch specific one.
      
      The generic one is defined as
      (pte_present(pte) && (!(write) || pte_write(pte)))
      
The arch-specific one is open coded, checking that the _PAGE_USER and
_PAGE_WRITE (_PAGE_RW) flags are set, but failing to check that
_PAGE_RO and _PAGE_PRIVILEGED are unset, leading to a useless test
on targets like the 8xx which define _PAGE_RW and _PAGE_USER as 0.
      
      Commit 5fa5b16b ("powerpc/mm/hugetlb: Use pte_access_permitted
      for hugetlb access check") replaced some tests performed with
      pte helpers by a call to pte_access_permitted(), leading to the same
      issue.
      
      This patch rewrites powerpc/nohash pte_access_permitted()
      using pte helpers.
      
      Fixes: 5769beaf ("powerpc/mm: Add proper pte access check helper for other platforms")
      Fixes: 5fa5b16b ("powerpc/mm/hugetlb: Use pte_access_permitted for hugetlb access check")
      Cc: stable@vger.kernel.org # v4.15+
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
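The rewritten helper has the same shape as the generic one, but is built
from the pte accessors so that _PAGE_RO and _PAGE_PRIVILEGED are honoured
however each target encodes them; roughly:

  static inline bool pte_access_permitted(pte_t pte, bool write)
  {
  	/* pte_user()/pte_read() encapsulate _PAGE_PRIVILEGED/_PAGE_RO,
  	   whatever bits a given nohash target actually uses. */
  	if (!pte_present(pte) || !pte_user(pte) || !pte_read(pte))
  		return false;

  	if (write && !pte_write(pte))
  		return false;

  	return true;
  }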
4. 21 Aug 2018, 1 commit
• powerpc/topology: Get topology for shared processors at boot · 2ea62630
  Committed by Srikar Dronamraju
On a shared LPAR, Phyp will not update the CPU associativity at boot
time. Just after boot, the system recognizes itself as a shared
LPAR and triggers a request for the correct CPU associativity. But by
then the scheduler will have already created/destroyed its sched domains.
      
This causes:
  - Broken load balancing across nodes, causing islands of cores.
  - Performance degradation, especially if the system is lightly loaded.
  - dmesg wrongly reporting all CPUs to be in Node 0.
  - Messages in dmesg saying "borken" topology.
  - With commit 051f3ca0 ("sched/topology: Introduce NUMA identity
    node sched domain"), rcu stalls at boot up.
      
The sched_domains_numa_masks table which is used to generate cpumasks
is only created at boot time, just before creating sched domains, and
is never updated. Hence, it's better to get the topology correct before
the sched domains are created.
      
      For example on 64 core Power 8 shared LPAR, dmesg reports
      
        Brought up 512 CPUs
        Node 0 CPUs: 0-511
        Node 1 CPUs:
        Node 2 CPUs:
        Node 3 CPUs:
        Node 4 CPUs:
        Node 5 CPUs:
        Node 6 CPUs:
        Node 7 CPUs:
        Node 8 CPUs:
        Node 9 CPUs:
        Node 10 CPUs:
        Node 11 CPUs:
        ...
        BUG: arch topology borken
             the DIE domain not a subset of the NUMA domain
        BUG: arch topology borken
             the DIE domain not a subset of the NUMA domain
      
      numactl/lscpu output will still be correct with cores spreading across
      all nodes:
      
        Socket(s):             64
        NUMA node(s):          12
        Model:                 2.0 (pvr 004d 0200)
        Model name:            POWER8 (architected), altivec supported
        Hypervisor vendor:     pHyp
        Virtualization type:   para
        L1d cache:             64K
        L1i cache:             32K
        NUMA node0 CPU(s): 0-7,32-39,64-71,96-103,176-183,272-279,368-375,464-471
        NUMA node1 CPU(s): 8-15,40-47,72-79,104-111,184-191,280-287,376-383,472-479
        NUMA node2 CPU(s): 16-23,48-55,80-87,112-119,192-199,288-295,384-391,480-487
        NUMA node3 CPU(s): 24-31,56-63,88-95,120-127,200-207,296-303,392-399,488-495
        NUMA node4 CPU(s):     208-215,304-311,400-407,496-503
        NUMA node5 CPU(s):     168-175,264-271,360-367,456-463
        NUMA node6 CPU(s):     128-135,224-231,320-327,416-423
        NUMA node7 CPU(s):     136-143,232-239,328-335,424-431
        NUMA node8 CPU(s):     216-223,312-319,408-415,504-511
        NUMA node9 CPU(s):     144-151,240-247,336-343,432-439
        NUMA node10 CPU(s):    152-159,248-255,344-351,440-447
        NUMA node11 CPU(s):    160-167,256-263,352-359,448-455
      
Currently on this LPAR, the scheduler detects 2 levels of NUMA and
creates NUMA sched domains for all CPUs, but it finds a single DIE
domain consisting of all CPUs. Hence it deletes all NUMA sched
domains.
      
To address this, detect the shared processor and update the topology
soon after CPUs are set up, so that the correct topology is in place
just before the scheduler creates its sched domains.
      
      With the fix, dmesg reports:
      
        numa: Node 0 CPUs: 0-7 32-39 64-71 96-103 176-183 272-279 368-375 464-471
        numa: Node 1 CPUs: 8-15 40-47 72-79 104-111 184-191 280-287 376-383 472-479
        numa: Node 2 CPUs: 16-23 48-55 80-87 112-119 192-199 288-295 384-391 480-487
        numa: Node 3 CPUs: 24-31 56-63 88-95 120-127 200-207 296-303 392-399 488-495
        numa: Node 4 CPUs: 208-215 304-311 400-407 496-503
        numa: Node 5 CPUs: 168-175 264-271 360-367 456-463
        numa: Node 6 CPUs: 128-135 224-231 320-327 416-423
        numa: Node 7 CPUs: 136-143 232-239 328-335 424-431
        numa: Node 8 CPUs: 216-223 312-319 408-415 504-511
        numa: Node 9 CPUs: 144-151 240-247 336-343 432-439
        numa: Node 10 CPUs: 152-159 248-255 344-351 440-447
        numa: Node 11 CPUs: 160-167 256-263 352-359 448-455
      
      and lscpu also reports:
      
        Socket(s):             64
        NUMA node(s):          12
        Model:                 2.0 (pvr 004d 0200)
        Model name:            POWER8 (architected), altivec supported
        Hypervisor vendor:     pHyp
        Virtualization type:   para
        L1d cache:             64K
        L1i cache:             32K
        NUMA node0 CPU(s): 0-7,32-39,64-71,96-103,176-183,272-279,368-375,464-471
        NUMA node1 CPU(s): 8-15,40-47,72-79,104-111,184-191,280-287,376-383,472-479
        NUMA node2 CPU(s): 16-23,48-55,80-87,112-119,192-199,288-295,384-391,480-487
        NUMA node3 CPU(s): 24-31,56-63,88-95,120-127,200-207,296-303,392-399,488-495
        NUMA node4 CPU(s):     208-215,304-311,400-407,496-503
        NUMA node5 CPU(s):     168-175,264-271,360-367,456-463
        NUMA node6 CPU(s):     128-135,224-231,320-327,416-423
        NUMA node7 CPU(s):     136-143,232-239,328-335,424-431
        NUMA node8 CPU(s):     216-223,312-319,408-415,504-511
        NUMA node9 CPU(s):     144-151,240-247,336-343,432-439
        NUMA node10 CPU(s):    152-159,248-255,344-351,440-447
        NUMA node11 CPU(s):    160-167,256-263,352-359,448-455
Reported-by: Manjunatha H R <manjuhr1@in.ibm.com>
Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
[mpe: Trim / format change log]
Tested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5. 20 Aug 2018, 1 commit
6. 18 Aug 2018, 1 commit
• mm: convert return type of handle_mm_fault() caller to vm_fault_t · 50a7ca3c
  Committed by Souptick Joarder
Use the new return type vm_fault_t for fault handlers.  For now, this is
just documenting that the function returns a VM_FAULT value rather than
an errno.  Once all instances are converted, vm_fault_t will become a
distinct type.

Ref: commit 1c8f4220 ("mm: change return type to vm_fault_t")

In this patch all the callers of handle_mm_fault() are changed to
return the vm_fault_t type.
      
Link: http://lkml.kernel.org/r/20180617084810.GA6730@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
      Cc: Matthew Wilcox <willy@infradead.org>
      Cc: Richard Henderson <rth@twiddle.net>
      Cc: Tony Luck <tony.luck@intel.com>
      Cc: Matt Turner <mattst88@gmail.com>
      Cc: Vineet Gupta <vgupta@synopsys.com>
      Cc: Russell King <linux@armlinux.org.uk>
      Cc: Catalin Marinas <catalin.marinas@arm.com>
      Cc: Will Deacon <will.deacon@arm.com>
      Cc: Richard Kuo <rkuo@codeaurora.org>
      Cc: Geert Uytterhoeven <geert@linux-m68k.org>
      Cc: Michal Simek <monstr@monstr.eu>
      Cc: James Hogan <jhogan@kernel.org>
      Cc: Ley Foon Tan <lftan@altera.com>
      Cc: Jonas Bonn <jonas@southpole.se>
      Cc: James E.J. Bottomley <jejb@parisc-linux.org>
      Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
      Cc: Palmer Dabbelt <palmer@sifive.com>
      Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Richard Weinberger <richard@nod.at>
      Cc: Guan Xuetao <gxt@pku.edu.cn>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: "H. Peter Anvin" <hpa@zytor.com>
      Cc: "Levin, Alexander (Sasha Levin)" <alexander.levin@verizon.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
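After the conversion a typical caller reads like the following (a sketch
using the shape of powerpc's fault handler as an example; mm_fault_error()
is the arch-specific error path):

  vm_fault_t fault;

  /* The result is a bitmask of VM_FAULT_* codes, not an errno. */
  fault = handle_mm_fault(vma, address, flags);
  if (fault & VM_FAULT_ERROR)
  	return mm_fault_error(regs, address, fault);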
7. 13 Aug 2018, 1 commit
8. 10 Aug 2018, 8 commits
• powerpc/uaccess: Enable get_user(u64, *p) on 32-bit · f7a6947c
  Committed by Michael Ellerman
Currently if you build a 32-bit powerpc kernel and use get_user() to
load a u64 value, it will fail to build with e.g.:
      
        kernel/rseq.o: In function `rseq_get_rseq_cs':
        kernel/rseq.c:123: undefined reference to `__get_user_bad'
      
      This is hitting the check in __get_user_size() that makes sure the
      size we're copying doesn't exceed the size of the destination:
      
        #define __get_user_size(x, ptr, size, retval)
        do {
        	retval = 0;
        	__chk_user_ptr(ptr);
        	if (size > sizeof(x))
        		(x) = __get_user_bad();
      
That doesn't immediately make sense, because the size of the
destination is u64; but the real destination isn't, because
__get_user_check() etc. internally create an unsigned long and copy
into that:
      
        #define __get_user_check(x, ptr, size)
        ({
        	long __gu_err = -EFAULT;
        	unsigned long  __gu_val = 0;
      
The problem is that on 32-bit, unsigned long is not big enough to
hold a u64. We can fix this with a trick from hpa in the x86 code: we
statically check the type of x and set the type of __gu_val to either
unsigned long or unsigned long long.
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
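The trick selects the internal variable's type at compile time with
__builtin_choose_expr; the powerpc version looks roughly like:

  /* unsigned long for sizes up to a long, unsigned long long above that. */
  #define __long_type(x) \
  	__typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL))

  #define __get_user_check(x, ptr, size)			\
  ({								\
  	long __gu_err = -EFAULT;				\
  	__long_type(*(ptr)) __gu_val = 0;			\
  	/* ... copy 'size' bytes into __gu_val, then: */	\
  	(x) = (__typeof__(*(ptr)))__gu_val;			\
  	__gu_err;						\
  })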
• powerpc/mm/hash: Remove unnecessary do { } while(0) loop · f405b510
  Committed by Aneesh Kumar K.V
Avoid Coverity false-positive warnings like:
      
        *** CID 187347:  Control flow issues  (UNREACHABLE)
        /arch/powerpc/mm/hash_native_64.c: 819 in native_flush_hash_range()
        813        slot += hidx & _PTEIDX_GROUP_IX;
        814        hptep = htab_address + slot;
        815        want_v = hpte_encode_avpn(vpn, psize, ssize);
        816        hpte_v = hpte_get_old_v(hptep);
        817
        818        if (!HPTE_V_COMPARE(hpte_v, want_v) || !(hpte_v & HPTE_V_VALID))
        >>>     CID 187347:  Control flow issues  (UNREACHABLE)
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
• powerpc/64s: move machine check SLB flushing to mm/slb.c · e7e81847
  Committed by Nicholas Piggin
      The machine check code that flushes and restores bolted segments in
      real mode belongs in mm/slb.c. This will also be used by pseries
      machine check and idle code in future changes.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
• powerpc/mm/tlbflush: update the mmu_gather page size while iterating address range · 0b6aa1a2
  Committed by Aneesh Kumar K.V
This patch makes sure we update the mmu_gather page size even if we are
requesting a fullmm flush. This avoids triggering the VM_WARN_ON in code
paths like __tlb_remove_page_size that explicitly check that the page
size of the range being removed is the same as the mmu_gather page size.
      
      Fixes: 5a609934 ("powerpc/64s/radix: tlb do not flush on page size when fullmm")
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Acked-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
• powerpc/mm: remove huge_pte_offset_and_shift() prototype · 646dbe40
  Committed by Christophe Leroy
huge_pte_offset_and_shift() has never existed.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
• powerpc/lib: Use patch_site to patch copy_32 functions once cache is enabled · fa54a981
  Committed by Christophe Leroy
The symbol memcpy_nocache_branch, defined in order to allow patching
of the memset function once the cache is enabled, leads to confusing
reports by the perf tool.

Using the new patch_site functionality solves this issue.
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
• powerpc/fadump: handle crash memory ranges array index overflow · 1bd6a1c4
  Committed by Hari Bathini
Crash memory ranges is an array of memory ranges of the crashing kernel
to be exported as a dump via the /proc/vmcore file. The size of the array
is set based on INIT_MEMBLOCK_REGIONS, which works fine in most cases,
where the memblock memory regions count is less than the
INIT_MEMBLOCK_REGIONS value. But this count can grow beyond that value
since commit 142b45a7 ("memblock: Add array resizing support").
      
      On large memory systems with a few DLPAR operations, the memblock memory
      regions count could be larger than INIT_MEMBLOCK_REGIONS value. On such
      systems, registering fadump results in crash or other system failures
      like below:
      
        task: c00007f39a290010 ti: c00000000b738000 task.ti: c00000000b738000
        NIP: c000000000047df4 LR: c0000000000f9e58 CTR: c00000000010f180
        REGS: c00000000b73b570 TRAP: 0300   Tainted: G          L   X  (4.4.140+)
        MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE>  CR: 22004484  XER: 20000000
        CFAR: c000000000008500 DAR: 000007a450000000 DSISR: 40000000 SOFTE: 0
        ...
        NIP [c000000000047df4] smp_send_reschedule+0x24/0x80
        LR [c0000000000f9e58] resched_curr+0x138/0x160
        Call Trace:
          resched_curr+0x138/0x160 (unreliable)
          check_preempt_curr+0xc8/0xf0
          ttwu_do_wakeup+0x38/0x150
          try_to_wake_up+0x224/0x4d0
          __wake_up_common+0x94/0x100
          ep_poll_callback+0xac/0x1c0
          __wake_up_common+0x94/0x100
          __wake_up_sync_key+0x70/0xa0
          sock_def_readable+0x58/0xa0
          unix_stream_sendmsg+0x2dc/0x4c0
          sock_sendmsg+0x68/0xa0
          ___sys_sendmsg+0x2cc/0x2e0
          __sys_sendmsg+0x5c/0xc0
          SyS_socketcall+0x36c/0x3f0
          system_call+0x3c/0x100
      
because array index overflow is not checked while setting up crash memory
ranges, causing memory corruption. To resolve this issue, dynamically
allocate memory for crash memory ranges and resize it incrementally,
in units of page size, on hitting the array size limit.
      
      Fixes: 2df173d9 ("fadump: Initialize elfcore header and add PT_LOAD program headers.")
      Cc: stable@vger.kernel.org # v3.4+
Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
Reviewed-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
[mpe: Just use PAGE_SIZE directly, fixup variable placement]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
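The resizing pattern is straightforward (a sketch of the approach; the
variable names only approximate the fadump code, which is assumed to keep
the array, its byte size and its capacity as globals):

  static int fadump_grow_crash_mem_ranges(void)
  {
  	struct fad_crash_memory_ranges *new_array;
  	unsigned long new_size = crash_mem_ranges_size + PAGE_SIZE;

  	/* Grow by a page at a time rather than sizing for the worst case. */
  	new_array = krealloc(crash_memory_ranges, new_size, GFP_KERNEL);
  	if (!new_array)
  		return -ENOMEM;

  	crash_memory_ranges = new_array;
  	crash_mem_ranges_size = new_size;
  	max_crash_mem_ranges = new_size / sizeof(*new_array);
  	return 0;
  }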
• powerpc/cpm1: fix compilation error with CONFIG_PPC_EARLY_DEBUG_CPM · 6bd6d867
  Committed by Christophe Leroy
Commit e8cb7a55 ("powerpc: remove superflous inclusions of
asm/fixmap.h") removed the inclusion of asm/fixmap.h from files not
using objects from that file.

However, asm/mmu-8xx.h includes a call to __fix_to_virt(). The proper
way would be to include asm/fixmap.h in asm/mmu-8xx.h, but that creates
an inclusion loop.

So we have to leave asm/fixmap.h in sysdev/cpm_common.c for
CONFIG_PPC_EARLY_DEBUG_CPM:
      
        CC      arch/powerpc/sysdev/cpm_common.o
      In file included from ./arch/powerpc/include/asm/mmu.h:340:0,
                       from ./arch/powerpc/include/asm/reg_8xx.h:8,
                       from ./arch/powerpc/include/asm/reg.h:29,
                       from ./arch/powerpc/include/asm/processor.h:13,
                       from ./arch/powerpc/include/asm/thread_info.h:28,
                       from ./include/linux/thread_info.h:38,
                       from ./arch/powerpc/include/asm/ptrace.h:159,
                       from ./arch/powerpc/include/asm/hw_irq.h:12,
                       from ./arch/powerpc/include/asm/irqflags.h:12,
                       from ./include/linux/irqflags.h:16,
                       from ./include/asm-generic/cmpxchg-local.h:6,
                       from ./arch/powerpc/include/asm/cmpxchg.h:537,
                       from ./arch/powerpc/include/asm/atomic.h:11,
                       from ./include/linux/atomic.h:5,
                       from ./include/linux/mutex.h:18,
                       from ./include/linux/kernfs.h:13,
                       from ./include/linux/sysfs.h:16,
                       from ./include/linux/kobject.h:20,
                       from ./include/linux/device.h:16,
                       from ./include/linux/node.h:18,
                       from ./include/linux/cpu.h:17,
                       from ./include/linux/of_device.h:5,
                       from arch/powerpc/sysdev/cpm_common.c:21:
      arch/powerpc/sysdev/cpm_common.c: In function ‘udbg_init_cpm’:
      ./arch/powerpc/include/asm/mmu-8xx.h:218:25: error: implicit declaration of function ‘__fix_to_virt’ [-Werror=implicit-function-declaration]
       #define VIRT_IMMR_BASE (__fix_to_virt(FIX_IMMR_BASE))
                               ^
      arch/powerpc/sysdev/cpm_common.c:75:7: note: in expansion of macro ‘VIRT_IMMR_BASE’
             VIRT_IMMR_BASE);
             ^
      ./arch/powerpc/include/asm/mmu-8xx.h:218:39: error: ‘FIX_IMMR_BASE’ undeclared (first use in this function)
       #define VIRT_IMMR_BASE (__fix_to_virt(FIX_IMMR_BASE))
                                             ^
      arch/powerpc/sysdev/cpm_common.c:75:7: note: in expansion of macro ‘VIRT_IMMR_BASE’
             VIRT_IMMR_BASE);
             ^
      ./arch/powerpc/include/asm/mmu-8xx.h:218:39: note: each undeclared identifier is reported only once for each function it appears in
       #define VIRT_IMMR_BASE (__fix_to_virt(FIX_IMMR_BASE))
                                             ^
      arch/powerpc/sysdev/cpm_common.c:75:7: note: in expansion of macro ‘VIRT_IMMR_BASE’
             VIRT_IMMR_BASE);
             ^
      cc1: all warnings being treated as errors
      make[1]: *** [arch/powerpc/sysdev/cpm_common.o] Error 1
      
      Fixes: e8cb7a55 ("powerpc: remove superflous inclusions of asm/fixmap.h")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
9. 09 Aug 2018, 1 commit
• powerpc/cpm1: fix compilation error with CONFIG_PPC_EARLY_DEBUG_CPM · 2a39926c
  Committed by Christophe Leroy
Commit e8cb7a55 ("powerpc: remove superflous inclusions of
asm/fixmap.h") removed the inclusion of asm/fixmap.h from files not
using objects from that file.

However, asm/mmu-8xx.h includes a call to __fix_to_virt(). The proper
way would be to include asm/fixmap.h in asm/mmu-8xx.h, but that creates
an inclusion loop.

So we have to leave asm/fixmap.h in sysdev/cpm_common.c for
CONFIG_PPC_EARLY_DEBUG_CPM:
      
        CC      arch/powerpc/sysdev/cpm_common.o
      In file included from ./arch/powerpc/include/asm/mmu.h:340:0,
                       from ./arch/powerpc/include/asm/reg_8xx.h:8,
                       from ./arch/powerpc/include/asm/reg.h:29,
                       from ./arch/powerpc/include/asm/processor.h:13,
                       from ./arch/powerpc/include/asm/thread_info.h:28,
                       from ./include/linux/thread_info.h:38,
                       from ./arch/powerpc/include/asm/ptrace.h:159,
                       from ./arch/powerpc/include/asm/hw_irq.h:12,
                       from ./arch/powerpc/include/asm/irqflags.h:12,
                       from ./include/linux/irqflags.h:16,
                       from ./include/asm-generic/cmpxchg-local.h:6,
                       from ./arch/powerpc/include/asm/cmpxchg.h:537,
                       from ./arch/powerpc/include/asm/atomic.h:11,
                       from ./include/linux/atomic.h:5,
                       from ./include/linux/mutex.h:18,
                       from ./include/linux/kernfs.h:13,
                       from ./include/linux/sysfs.h:16,
                       from ./include/linux/kobject.h:20,
                       from ./include/linux/device.h:16,
                       from ./include/linux/node.h:18,
                       from ./include/linux/cpu.h:17,
                       from ./include/linux/of_device.h:5,
                       from arch/powerpc/sysdev/cpm_common.c:21:
      arch/powerpc/sysdev/cpm_common.c: In function ‘udbg_init_cpm’:
      ./arch/powerpc/include/asm/mmu-8xx.h:218:25: error: implicit declaration of function ‘__fix_to_virt’ [-Werror=implicit-function-declaration]
       #define VIRT_IMMR_BASE (__fix_to_virt(FIX_IMMR_BASE))
                               ^
      arch/powerpc/sysdev/cpm_common.c:75:7: note: in expansion of macro ‘VIRT_IMMR_BASE’
             VIRT_IMMR_BASE);
             ^
      ./arch/powerpc/include/asm/mmu-8xx.h:218:39: error: ‘FIX_IMMR_BASE’ undeclared (first use in this function)
       #define VIRT_IMMR_BASE (__fix_to_virt(FIX_IMMR_BASE))
                                             ^
      arch/powerpc/sysdev/cpm_common.c:75:7: note: in expansion of macro ‘VIRT_IMMR_BASE’
             VIRT_IMMR_BASE);
             ^
      ./arch/powerpc/include/asm/mmu-8xx.h:218:39: note: each undeclared identifier is reported only once for each function it appears in
       #define VIRT_IMMR_BASE (__fix_to_virt(FIX_IMMR_BASE))
                                             ^
      arch/powerpc/sysdev/cpm_common.c:75:7: note: in expansion of macro ‘VIRT_IMMR_BASE’
             VIRT_IMMR_BASE);
             ^
      cc1: all warnings being treated as errors
      make[1]: *** [arch/powerpc/sysdev/cpm_common.o] Error 1
      
      Fixes: e8cb7a55 ("powerpc: remove superflous inclusions of asm/fixmap.h")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Scott Wood <oss@buserror.net>
10. 07 Aug 2018, 1 commit
• crypto/nx: Initialize 842 high and normal RxFIFO control registers · 656ecc16
  Committed by Haren Myneni
NX increments readOffset by the FIFO size in the receive FIFO control
register when a CRB is read. But the index in the RxFIFO has to match the
corresponding entry in the FIFO maintained by VAS in the kernel. Otherwise
NX may process incorrect CRBs, which can cause CRB timeouts.
      
The VAS FIFO offset is 0 when the receive window is opened during
initialization. When the module is reloaded or in a kexec boot, the
readOffset in the FIFO control register may not match the VAS entry.
This patch adds an nx_coproc_init OPAL call to reset readOffset and
queued entries in the FIFO control register for both high and normal
FIFOs.
Signed-off-by: Haren Myneni <haren@us.ibm.com>
[mpe: Fixup uninitialized variable warning]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
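The reset itself is then a single OPAL call per coprocessor at init time,
along these lines (a sketch; opal_nx_coproc_init takes the chip id and the
coprocessor type, and the coproc fields here are assumptions):

  /* Clear readOffset and the queued-entry count for this chip's NX. */
  rc = opal_nx_coproc_init(coproc->chip_id, coproc->ct);
  if (rc)
  	pr_err("Failed to initialize NX for chip %d: %d\n",
  	       coproc->chip_id, rc);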