1. 12 3月, 2016 3 次提交
    • C
      powerpc/8xx: move setup_initial_memory_limit() into 8xx_mmu.c · 516d9189
      Christophe Leroy 提交于
      Now we have a 8xx specific .c file for that so put it in there
      as other powerpc variants do
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: NScott Wood <oss@buserror.net>
      516d9189
    • C
      powerpc/8xx: Map linear kernel RAM with 8M pages · a372acfa
      Christophe Leroy 提交于
      On a live running system (VoIP gateway for Air Trafic Control), over
      a 10 minutes period (with 277s idle), we get 87 millions DTLB misses
      and approximatly 35 secondes are spent in DTLB handler.
      This represents 5.8% of the overall time and even 10.8% of the
      non-idle time.
      Among those 87 millions DTLB misses, 15% are on user addresses and
      85% are on kernel addresses. And within the kernel addresses, 93%
      are on addresses from the linear address space and only 7% are on
      addresses from the virtual address space.
      
      MPC8xx has no BATs but it has 8Mb page size. This patch implements
      mapping of kernel RAM using 8Mb pages, on the same model as what is
      done on the 40x.
      
      In 4k pages mode, each PGD entry maps a 4Mb area: we map every two
      entries to the same 8Mb physical page. In each second entry, we add
      4Mb to the page physical address to ease life of the FixupDAR
      routine. This is just ignored by HW.
      
      In 16k pages mode, each PGD entry maps a 64Mb area: each PGD entry
      will point to the first page of the area. The DTLB handler adds
      the 3 bits from EPN to map the correct page.
      
      With this patch applied, we now get only 13 millions TLB misses
      during the 10 minutes period. The idle time has increased to 313s
      and the overall time spent in DTLB miss handler is 6.3s, which
      represents 1% of the overall time and 2.2% of non-idle time.
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: NScott Wood <oss@buserror.net>
      a372acfa
    • C
      powerpc/8xx: Save r3 all the time in DTLB miss handler · 913a6b3d
      Christophe Leroy 提交于
      We are spending between 40 and 160 cycles with a mean of 65 cycles in
      the DTLB handling routine (measured with mftbl) so make it more
      simple althought it adds one instruction.
      With this modification, we get three registers available at all time,
      which will help with following patch.
      Signed-off-by: NChristophe Leroy <christophe.leroy@c-s.fr>
      Signed-off-by: NScott Wood <oss@buserror.net>
      913a6b3d
  2. 10 3月, 2016 9 次提交
  3. 05 3月, 2016 14 次提交
  4. 03 3月, 2016 8 次提交
  5. 02 3月, 2016 6 次提交
    • C
      powerpc: Add the ability to save VSX without giving it up · bf6a4d5b
      Cyril Bur 提交于
      This patch adds the ability to be able to save the VSX registers to the
      thread struct without giving up (disabling the facility) next time the
      process returns to userspace.
      
      This patch builds on a previous optimisation for the FPU and VEC registers
      in the thread copy path to avoid a possibly pointless reload of VSX state.
      Signed-off-by: NCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      bf6a4d5b
    • C
      powerpc: Add the ability to save Altivec without giving it up · 6f515d84
      Cyril Bur 提交于
      This patch adds the ability to be able to save the VEC registers to the
      thread struct without giving up (disabling the facility) next time the
      process returns to userspace.
      
      This patch builds on a previous optimisation for the FPU registers in the
      thread copy path to avoid a possibly pointless reload of VEC state.
      Signed-off-by: NCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      6f515d84
    • C
      powerpc: Add the ability to save FPU without giving it up · 8792468d
      Cyril Bur 提交于
      This patch adds the ability to be able to save the FPU registers to the
      thread struct without giving up (disabling the facility) next time the
      process returns to userspace.
      
      This patch optimises the thread copy path (as a result of a fork() or
      clone()) so that the parent thread can return to userspace with hot
      registers avoiding a possibly pointless reload of FPU register state.
      Signed-off-by: NCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      8792468d
    • C
      powerpc: Prepare for splitting giveup_{fpu, altivec, vsx} in two · de2a20aa
      Cyril Bur 提交于
      This prepares for the decoupling of saving {fpu,altivec,vsx} registers and
      marking {fpu,altivec,vsx} as being unused by a thread.
      
      Currently giveup_{fpu,altivec,vsx}() does both however optimisations to
      task switching can be made if these two operations are decoupled.
      save_all() will permit the saving of registers to thread structs and leave
      threads MSR with bits enabled.
      
      This patch introduces no functional change.
      Signed-off-by: NCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      de2a20aa
    • C
      powerpc: Restore FPU/VEC/VSX if previously used · 70fe3d98
      Cyril Bur 提交于
      Currently the FPU, VEC and VSX facilities are lazily loaded. This is not
      a problem unless a process is using these facilities.
      
      Modern versions of GCC are very good at automatically vectorising code,
      new and modernised workloads make use of floating point and vector
      facilities, even the kernel makes use of vectorised memcpy.
      
      All this combined greatly increases the cost of a syscall since the
      kernel uses the facilities sometimes even in syscall fast-path making it
      increasingly common for a thread to take an *_unavailable exception soon
      after a syscall, not to mention potentially taking all three.
      
      The obvious overcompensation to this problem is to simply always load
      all the facilities on every exit to userspace. Loading up all FPU, VEC
      and VSX registers every time can be expensive and if a workload does
      avoid using them, it should not be forced to incur this penalty.
      
      An 8bit counter is used to detect if the registers have been used in the
      past and the registers are always loaded until the value wraps to back
      to zero.
      
      Several versions of the assembly in entry_64.S were tested:
      
        1. Always calling C.
        2. Performing a common case check and then calling C.
        3. A complex check in asm.
      
      After some benchmarking it was determined that avoiding C in the common
      case is a performance benefit (option 2). The full check in asm (option
      3) greatly complicated that codepath for a negligible performance gain
      and the trade-off was deemed not worth it.
      Signed-off-by: NCyril Bur <cyrilbur@gmail.com>
      [mpe: Move load_vec in the struct to fill an existing hole, reword change log]
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      
      fixup
      70fe3d98
    • C
      powerpc: Explicitly disable math features when copying thread · d272f667
      Cyril Bur 提交于
      Currently when threads get scheduled off they always giveup the FPU,
      Altivec (VMX) and Vector (VSX) units if they were using them. When they are
      scheduled back on a fault is then taken to enable each facility and load
      registers. As a result explicitly disabling FPU/VMX/VSX has not been
      necessary.
      
      Future changes and optimisations remove this mandatory giveup and fault
      which could cause calls such as clone() and fork() to copy threads and run
      them later with FPU/VMX/VSX enabled but no registers loaded.
      
      This patch starts the process of having MSR_{FP,VEC,VSX} mean that a
      threads registers are hot while not having MSR_{FP,VEC,VSX} means that the
      registers must be loaded. This allows for a smarter return to userspace.
      Signed-off-by: NCyril Bur <cyrilbur@gmail.com>
      Signed-off-by: NMichael Ellerman <mpe@ellerman.id.au>
      d272f667